Kubernetes version of low-latency multi-region tutorial with Secure Mode
Simple illustration of a multi-region application in a multi-region Kubernetes deployment!
Way back in 2019, my manager asked me to assist our docs team with a multi-region illustration on Kubernetes, and I wrote up this GitHub issue. It never made it into the docs, so I am taking it to a blog instead, since I have shared this example with many interested parties. I'll expand it here with some commentary, along with a follow-up post on the new CockroachDB multi-region SQL commands, which can be applied in the steps below (a brief sketch of those commands appears right after the step list). The issue highlights how to run a multi-region app in a multi-region Kubernetes deployment, expanding on our multi-region tutorial featuring MovR, geo-partitioning, and duplicate indexes, but on a Kubernetes-based deployment. So let's break down the steps.
- Using the GKE Multi-Cluster steps, set up 9 CockroachDB nodes across three Kubernetes clusters in the us-east1-b, us-west1-b, and us-central1-b zones on GKE.
- Launch a pod in each region containing the MovR application, which uses a secure connection to the local CockroachDB nodes in the same region.
- Load the MovR data set from one region.
- Run the MovR application across all 3 regions, using the multi-region schema optimizations for best performance and data protection within regions.
- Scale the workloads and pods to illustrate how additional nodes can be easily spun up to accommodate peaks or growth in capacity or throughput.
- Use the GKE Multi-Cluster steps to destroy the setup.
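As promised, here is a minimal sketch of the newer multi-region SQL syntax (available in CockroachDB v21.1 and later) that replaces the manual partitioning steps shown later in this post. The region names here are assumptions and must match the --locality flags the nodes were started with:

ALTER DATABASE movr SET PRIMARY REGION "us-east1";
ALTER DATABASE movr ADD REGION "us-west1";
ALTER DATABASE movr ADD REGION "us-central1";
-- Pin each row to its home region for low-latency local reads and writes:
ALTER TABLE movr.users SET LOCALITY REGIONAL BY ROW;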
After following the steps in #1, kubectl config get-contexts displays the contexts for each of the clusters:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
gke_cockroach-rslee_us-central1-b_cockroachdb-central gke_cockroach-rslee_us-central1-b_cockroachdb-central gke_cockroach-rslee_us-central1-b_cockroachdb-central
gke_cockroach-rslee_us-east1-b_cockroachdb-east gke_cockroach-rslee_us-east1-b_cockroachdb-east gke_cockroach-rslee_us-east1-b_cockroachdb-east
gke_cockroach-rslee_us-east1-c_cockroachdb gke_cockroach-rslee_us-east1-c_cockroachdb gke_cockroach-rslee_us-east1-c_cockroachdb
* gke_cockroach-rslee_us-west1-b_client-west gke_cockroach-rslee_us-west1-b_client-west gke_cockroach-rslee_us-west1-b_client-west
gke_cockroach-rslee_us-west1-b_cockroachdb-west gke_cockroach-rslee_us-west1-b_cockroachdb-west gke_cockroach-rslee_us-west1-b_cockroachdb-west
minikube minikube minikube
And the following DB Console screenshot illustrates the 9 nodes across the 3 regions.
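If you prefer the command line to the DB Console, you can confirm the same thing with cockroach node status from any secure client pod (this example uses the cockroachdb-client-secure-central pod defined later in this post):

kubectl exec -it cockroachdb-client-secure-central --context=gke_cockroach-rslee_us-central1-b_cockroachdb-central -- ./cockroach node status --certs-dir=/cockroach-certs --host=cockroachdb-public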
Create a YAML manifest (movr.yaml) for the MovR image with access to the certs for secure access:
apiVersion: v1
kind: Pod
metadata:
  name: cockroachdb-movr
  labels:
    app: cockroachdb-client
spec:
  serviceAccountName: cockroachdb
  containers:
  - name: cockroachdb-client
    # image: cockroachdb/movr
    image: cockroachdb/movr:19.03.2
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: client-certs
      mountPath: /cockroach-certs
    # Keep the pod open indefinitely so kubectl exec can be used to get a shell to it
    # and run client commands, such as the MovR load and run invocations below.
    command:
    - sleep
    - "2147483648" # 2^31
  # This pod isn't doing anything important, so don't bother waiting to terminate it.
  terminationGracePeriodSeconds: 0
  volumes:
  - name: client-certs
    secret:
      secretName: cockroachdb.client.root
      defaultMode: 256
Create one MovR client in each of the central, east, and west clusters:
kubectl create -f movr.yaml --context=gke_cockroach-rslee_us-central1-b_cockroachdb-central
kubectl create -f movr.yaml --context=gke_cockroach-rslee_us-east1-b_cockroachdb-east
kubectl create -f movr.yaml --context=gke_cockroach-rslee_us-west1-b_cockroachdb-west
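Each pod should report Running before moving on; a quick sanity check looks like this (repeat with the east and central contexts):

kubectl get pod cockroachdb-movr --context=gke_cockroach-rslee_us-west1-b_cockroachdb-west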
Load the data from the west pod:
kubectl exec -it cockroachdb-movr --context=gke_cockroach-rslee_us-west1-b_cockroachdb-west -- python loadmovr.py --url "postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key" load --num-users 100 --num-rides 100 --num-vehicles 10
[INFO] (MainThread) connected to movr database @ postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key
Run Workload in West
kubectl exec -it cockroachdb-movr --context=gke_cockroach-rslee_us-west1-b_cockroachdb-west -- python loadmovr.py --url "postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key" --num-threads=15 run --city="seattle"
[INFO] (MainThread) connected to movr database @ postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key
[INFO] (MainThread) simulating movr load for cities ['seattle']
[INFO] (MainThread) warming up....
[INFO] (MainThread) starting load
Run Workload in Central
kubectl exec -it cockroachdb-movr --context=gke_cockroach-rslee_us-central1-b_cockroachdb-central -- python loadmovr.py --url "postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key" --num-threads=15 run --city="chicago"
[INFO] (MainThread) connected to movr database @ postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key
[INFO] (MainThread) simulating movr load for cities ['chicago']
[INFO] (MainThread) warming up....
[INFO] (MainThread) starting load
Run Workload in East
kubectl exec -it cockroachdb-movr --context=gke_cockroach-rslee_us-east1-b_cockroachdb-east -- python loadmovr.py --url "postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key" --num-threads=15 run --city="new york"
[INFO] (MainThread) connected to movr database @ postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key
[INFO] (MainThread) simulating movr load for cities ['new york']
[INFO] (MainThread) warming up....
[INFO] (MainThread) starting load
The MovR SQL latencies above are determined by reads, which might occur across regions, and by writes, which must be acknowledged in 2 of the 3 datacenters since CockroachDB writes require a quorum of replicas to be in agreement. To further optimize performance for reads and writes, we will reconfigure the tables to reduce latency using the geo-partitioning feature.
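Before repartitioning, you can see this replica spread for yourself. In the v19.1 release used here, a sketch of the range inspection query looks like this (newer versions use SHOW RANGES instead):

SHOW EXPERIMENTAL_RANGES FROM TABLE movr.users;

Each row lists a range's replicas and leaseholder, which at this point will be scattered across all three regions.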
The commands below pin the replicas for all ranges in a partition to a specific region, reducing the read and write penalty. For example, the users table is partitioned by the city identifier: rows containing the value “new york” fall in the partition new_york, and rows containing the value “chicago” fall in the partition chicago. Each partition's data is stored in one or more ranges with 3 replicas (copies). The zone configuration command ALTER PARTITION new_york OF TABLE movr.users CONFIGURE ZONE USING constraints='[+zone=us-east1-b]'
places the data on nodes located in the us-east1-b zone in GKE. This ensures data locality for “new york” users, vehicles, and rides: users in New York run their queries in the us-east1 location and achieve low-latency reads and writes. We have moved from protecting each row's data across regions to protecting it across nodes within the row's home region. Zone configurations also allow for cross-region data protection, as opposed to just local regional protection, so that even if all the nodes in a region go down, east users can still access their data for reads and writes from the last known committed state in the central or west regions. Cross-region protection is invoked simply by specifying the number of replicas and their locations, for example: ALTER PARTITION new_york OF TABLE movr.users CONFIGURE ZONE USING num_replicas=3, constraints='{+zone=us-east1-b: 1, +zone=us-west1-b: 1, +zone=us-central1-b: 1}'
ALTER TABLE users PARTITION BY LIST (city) ( PARTITION new_york VALUES IN ('new york'), PARTITION chicago VALUES IN ('chicago'), PARTITION seattle VALUES IN ('seattle') );
ALTER TABLE vehicles PARTITION BY LIST (city) ( PARTITION new_york VALUES IN ('new york'), PARTITION chicago VALUES IN ('chicago'), PARTITION seattle VALUES IN ('seattle') );
ALTER INDEX vehicles_auto_index_fk_city_ref_users PARTITION BY LIST (city) ( PARTITION new_york_idx VALUES IN ('new york'), PARTITION chicago_idx VALUES IN ('chicago'), PARTITION seattle_idx VALUES IN ('seattle') );
ALTER TABLE rides PARTITION BY LIST (city) ( PARTITION new_york VALUES IN ('new york'), PARTITION chicago VALUES IN ('chicago'), PARTITION seattle VALUES IN ('seattle') );
ALTER INDEX rides_auto_index_fk_city_ref_users PARTITION BY LIST (city) ( PARTITION new_york_idx1 VALUES IN ('new york'), PARTITION chicago_idx1 VALUES IN ('chicago'), PARTITION seattle_idx1 VALUES IN ('seattle') );
ALTER INDEX rides_auto_index_fk_vehicle_city_ref_vehicles PARTITION BY LIST (vehicle_city) ( PARTITION new_york_idx2 VALUES IN ('new york'), PARTITION chicago_idx2 VALUES IN ('chicago'), PARTITION seattle_idx2 VALUES IN ('seattle') );
ALTER PARTITION new_york OF TABLE movr.users CONFIGURE ZONE USING constraints='[+zone=us-east1-b]'; ALTER PARTITION chicago OF TABLE movr.users CONFIGURE ZONE USING constraints='[+zone=us-central1-b]'; ALTER PARTITION seattle OF TABLE movr.users CONFIGURE ZONE USING constraints='[+zone=us-west1-b]';
ALTER PARTITION new_york OF TABLE movr.vehicles CONFIGURE ZONE USING constraints='[+zone=us-east1-b]'; ALTER PARTITION chicago OF TABLE movr.vehicles CONFIGURE ZONE USING constraints='[+zone=us-central1-b]'; ALTER PARTITION seattle OF TABLE movr.vehicles CONFIGURE ZONE USING constraints='[+zone=us-west1-b]';
ALTER PARTITION new_york_idx OF TABLE movr.vehicles CONFIGURE ZONE USING constraints='[+zone=us-east1-b]'; ALTER PARTITION chicago_idx OF TABLE movr.vehicles CONFIGURE ZONE USING constraints='[+zone=us-central1-b]'; ALTER PARTITION seattle_idx OF TABLE movr.vehicles CONFIGURE ZONE USING constraints='[+zone=us-west1-b]';
ALTER PARTITION new_york OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-east1-b]'; ALTER PARTITION chicago OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-central1-b]'; ALTER PARTITION seattle OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-west1-b]';
ALTER PARTITION new_york_idx1 OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-east1-b]'; ALTER PARTITION chicago_idx1 OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-central1-b]'; ALTER PARTITION seattle_idx1 OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-west1-b]';
ALTER PARTITION new_york_idx2 OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-east1-b]'; ALTER PARTITION chicago_idx2 OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-central1-b]'; ALTER PARTITION seattle_idx2 OF TABLE movr.rides CONFIGURE ZONE USING constraints='[+zone=us-west1-b]';
Using the secure client (the per-region manifests are shown below), modify the schema by piping in the statements above, saved as movr.sql. The “Unable to use a TTY” warning is harmless when piping stdin; you can also drop the -t flag.
kubectl exec -it cockroachdb-client-secure-central --context=gke_cockroach-rslee_us-central1-b_cockroachdb-central -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public -d movr < movr.sql
Unable to use a TTY - input is not a terminal or the right kind of file
ALTER TABLE
ALTER TABLE
ALTER INDEX
ALTER TABLE
ALTER INDEX
ALTER INDEX
CONFIGURE ZONE 1
CONFIGURE ZONE 1
CONFIGURE ZONE 1
CONFIGURE ZONE 1
CONFIGURE ZONE 1
CONFIGURE ZONE 1
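To verify the zone configurations took effect, you can ask for a partition's configuration from the same secure client; a minimal sketch:

kubectl exec -it cockroachdb-client-secure-central --context=gke_cockroach-rslee_us-central1-b_cockroachdb-central -- ./cockroach sql --certs-dir=/cockroach-certs --host=cockroachdb-public -e "SHOW ZONE CONFIGURATION FOR PARTITION new_york OF TABLE movr.users;"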
Secure clients per region
client-secure-central.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cockroachdb-client-secure-central
  labels:
    app: cockroachdb-client
spec:
  serviceAccountName: cockroachdb
  containers:
  - name: cockroachdb-client
    image: cockroachdb/cockroach:v19.1.3
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: client-certs
      mountPath: /cockroach-certs
    # Keep the pod open indefinitely so kubectl exec can be used to get a shell to it
    # and run cockroach client commands, such as cockroach sql, cockroach node status, etc.
    command:
    - sleep
    - "2147483648" # 2^31
  # This pod isn't doing anything important, so don't bother waiting to terminate it.
  terminationGracePeriodSeconds: 0
  volumes:
  - name: client-certs
    secret:
      secretName: cockroachdb.client.root
      defaultMode: 256
client-secure-east.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cockroachdb-client-secure-east
  labels:
    app: cockroachdb-client
spec:
  serviceAccountName: cockroachdb
  containers:
  - name: cockroachdb-client
    image: cockroachdb/cockroach:v19.1.3
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: client-certs
      mountPath: /cockroach-certs
    # Keep the pod open indefinitely so kubectl exec can be used to get a shell to it
    # and run cockroach client commands, such as cockroach sql, cockroach node status, etc.
    command:
    - sleep
    - "2147483648" # 2^31
  # This pod isn't doing anything important, so don't bother waiting to terminate it.
  terminationGracePeriodSeconds: 0
  volumes:
  - name: client-certs
    secret:
      secretName: cockroachdb.client.root
      defaultMode: 256
client-secure-west.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cockroachdb-client-secure-west
  labels:
    app: cockroachdb-client
spec:
  serviceAccountName: cockroachdb
  containers:
  - name: cockroachdb-client
    image: cockroachdb/cockroach:v19.1.3
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: client-certs
      mountPath: /cockroach-certs
    # Keep the pod open indefinitely so kubectl exec can be used to get a shell to it
    # and run cockroach client commands, such as cockroach sql, cockroach node status, etc.
    command:
    - sleep
    - "2147483648" # 2^31
  # This pod isn't doing anything important, so don't bother waiting to terminate it.
  terminationGracePeriodSeconds: 0
  volumes:
  - name: client-certs
    secret:
      secretName: cockroachdb.client.root
      defaultMode: 256
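The schema step above already used the central secure client; for completeness, each secure client pod is created the same way as the MovR pods (assuming the manifests are saved under the file names shown above):

kubectl create -f client-secure-central.yaml --context=gke_cockroach-rslee_us-central1-b_cockroachdb-central
kubectl create -f client-secure-east.yaml --context=gke_cockroach-rslee_us-east1-b_cockroachdb-east
kubectl create -f client-secure-west.yaml --context=gke_cockroach-rslee_us-west1-b_cockroachdb-west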
The following DB Console screenshot depicts the movr database's replica locations across the 3 regions.
The following depicts the database zone configuration commands, which control replica placement.
After all the replicas have been reconfigured, we can see that throughput has increased and latency has decreased.
Let's wrap up with the last 2 steps. This is where some of the folks I have shared the issue with have found it most useful: using this setup to scale test their container environment.
With the MovR app, you simply add more scale by adding more data and more concurrency, tweaking the following flags (a full example invocation follows the list):
--num-threads 1 \
load \
--num-users 100 \
--num-rides 100 \
--num-vehicles 10 \
--city="boston" \
--city="new york" \
--city="washington dc" \
--city="amsterdam" \
--city="paris" \
--city="rome" \
--city="los angeles" \
--city="san francisco" \
--city="seattle"
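Putting it together, a sketch of a larger load run from the west pod might look like this (the counts and cities here are illustrative, not a benchmark recipe):

kubectl exec -it cockroachdb-movr --context=gke_cockroach-rslee_us-west1-b_cockroachdb-west -- python loadmovr.py --url "postgres://root@cockroachdb-public:26257/movr?sslmode=verify-full&sslrootcert=/cockroach-certs/ca.crt&sslcert=/cockroach-certs/client.root.crt&sslkey=/cockroach-certs/client.root.key" load --num-users 1000 --num-rides 1000 --num-vehicles 100 --city="seattle" --city="san francisco" --city="los angeles"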
As you scale the workload to exceed the data or CPU capacity of the CockroachDB pods, you can scale the StatefulSet to add more replicas:
kubectl scale statefulset cockroachdb --replicas=4 --context=gke_cockroach-rslee_us-west1-b_cockroachdb-west
Then observe how CockroachDB rebalances data to the new nodes, decreasing the latency caused by hardware contention.
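A quick way to watch this happen: listing the pods in the scaled cluster shows the new cockroachdb-3 pod, and the DB Console's per-node replica counts converge as rebalancing completes.

kubectl get pods --context=gke_cockroach-rslee_us-west1-b_cockroachdb-west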