Reads During Network Partitions
CockroachDB introduced follower reads in 2018 and bounded staleness reads in 2020; both serve reads from a local or follower replica rather than from the leaseholder. A great use case for this is allowing point reads on a node which is isolated from the rest of the cluster during a network partition. This blog illustrates how this works. Using a 6 node cluster on Docker, we have 3 regions (datacenters) set up, each with its own network bridge plus a shared bridge that connects the regions. The following config.yaml is used to build out the cluster, and we have one table whose leaseholder and follower replicas are spread across the cluster.
networks:
  cockroachdb-training-shared:
    name: cockroachdb-training-shared
    driver: bridge
  cockroachdb-training-dc0:
    name: cockroachdb-training-dc0
    driver: bridge
  cockroachdb-training-dc1:
    name: cockroachdb-training-dc1
    driver: bridge
  cockroachdb-training-dc2:
    name: cockroachdb-training-dc2
    driver: bridge
services:
  # DC 0 nodes
  roach-0:
    container_name: roach-0
    hostname: roach-0
    image: cockroachdb/cockroach:${COCKROACH_VERSION:-v22.2.8}
    networks:
      - cockroachdb-training-shared
      - cockroachdb-training-dc0
    command: start --logtostderr --insecure --locality=datacenter=dc-0 --join=roach-0,roach-1,roach-2
    ports:
      - 8080:8080
      - 26257:26257
  roach-1:
    container_name: roach-1
    hostname: roach-1
    image: cockroachdb/cockroach:${COCKROACH_VERSION:-v22.2.8}
    networks:
      - cockroachdb-training-shared
      - cockroachdb-training-dc0
    command: start --logtostderr --insecure --locality=datacenter=dc-0 --join=roach-0,roach-1,roach-2
    ports:
      - 8081:8080
      - 26258:26257
  # DC 1 nodes
  roach-2:
    container_name: roach-2
    hostname: roach-2
    image: cockroachdb/cockroach:${COCKROACH_VERSION:-v22.2.8}
    networks:
      - cockroachdb-training-shared
      - cockroachdb-training-dc1
    command: start --logtostderr --insecure --locality=datacenter=dc-1 --join=roach-0,roach-1,roach-2
    ports:
      - 8082:8080
      - 26259:26257
  roach-3:
    container_name: roach-3
    hostname: roach-3
    image: cockroachdb/cockroach:${COCKROACH_VERSION:-v22.2.8}
    networks:
      - cockroachdb-training-shared
      - cockroachdb-training-dc1
    command: start --logtostderr --insecure --locality=datacenter=dc-1 --join=roach-0,roach-1,roach-2
    ports:
      - 8083:8080
      - 26260:26257
  # DC 2 nodes
  roach-4:
    container_name: roach-4
    hostname: roach-4
    image: cockroachdb/cockroach:${COCKROACH_VERSION:-v22.2.8}
    networks:
      - cockroachdb-training-shared
      - cockroachdb-training-dc2
    command: start --logtostderr --insecure --locality=datacenter=dc-2 --join=roach-0,roach-1,roach-2
    ports:
      - 8084:8080
      - 26261:26257
  roach-5:
    container_name: roach-5
    hostname: roach-5
    image: cockroachdb/cockroach:${COCKROACH_VERSION:-v22.2.8}
    networks:
      - cockroachdb-training-shared
      - cockroachdb-training-dc2
    command: start --logtostderr --insecure --locality=datacenter=dc-2 --join=roach-0,roach-1,roach-2
    ports:
      - 8085:8080
      - 26262:26257
Start up the cluster with:
docker-compose up
Starting roach-3 … done
Starting roach-5 … done
Starting roach-2 … done
Starting roach-0 … done
Starting roach-1 … done
Starting roach-4 … done
This establishes a 6 node cluster across 3 datacenters, with a ycsb database loaded up.
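The post does not show how the cluster was initialized or how the ycsb database was loaded, so the following is only a sketch of one way to do it. It assumes roach-0 is used as the entry point, and the `run` helper with its `DRY_RUN` switch is a hypothetical convenience for previewing the commands; neither comes from the original setup.

```shell
#!/bin/sh
# Sketch: initialize the cluster and load the YCSB schema and data.
# Assumptions (not from the original post): roach-0 is the entry point,
# and DRY_RUN=1 prints each command instead of running it.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Nodes started with `start --join=...` wait for a one-time `cockroach init`.
init_cluster() {
  run docker exec roach-0 /cockroach/cockroach init --insecure --host=roach-0:26257
}

# `workload init ycsb` creates the ycsb database and its usertable table.
load_ycsb() {
  run docker exec roach-0 /cockroach/cockroach workload init ycsb \
    "postgresql://root@roach-0:26257?sslmode=disable"
}

# Usage, once the containers are up:
#   init_cluster && load_ycsb
```

Running with `DRY_RUN=1` first is a cheap way to check the exact commands before touching the cluster.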
Isolate a node with the following command. Afterwards we can see that node 4 is suspect and will soon be marked dead.
docker network disconnect cockroachdb-training-shared roach-4
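To confirm the partition from the command line, one option is to ask a healthy node for the cluster's liveness view. The sketch below is an assumption about how one might script that check: `dead_nodes` is a hypothetical helper that reads `cockroach node status --format=tsv` output and looks up the `id` and `is_live` columns by header name.

```shell
#!/bin/sh
# Print the id of every node whose is_live column is "false".
# Reads tab-separated `cockroach node status` output on stdin and finds
# the columns by header name rather than by position.
dead_nodes() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    $(col["is_live"]) == "false" { print $(col["id"]) }
  '
}

# Usage, from the host, asking a node that is still connected:
#   docker exec roach-0 /cockroach/cockroach node status --insecure --format=tsv | dead_nodes
#
# To heal the partition afterwards:
#   docker network connect cockroachdb-training-shared roach-4
```

The printed ids are CockroachDB node ids, which do not necessarily match the container names.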
With a SQL client connected to node 4 since before the network partition, I am still able to run follower reads by specifying the maximum tolerated staleness. This works for single-key lookups only.
docker exec -it roach-4 /cockroach/cockroach sql --insecure
root@localhost:26257/ycsb> select ycsb_key from usertable AS OF SYSTEM TIME with_max_staleness('10s') where ycsb_key in ('user9999414122401537262');
          ycsb_key
---------------------------
  user9999414122401537262
(1 row)
Time: 2ms total (execution 1ms / network 0ms)
root@localhost:26257/ycsb> select ycsb_key from usertable AS OF SYSTEM TIME with_max_staleness('10s') where ycsb_key in ('user9999414122401537262');
          ycsb_key
---------------------------
  user9999414122401537262
(1 row)
Time: 2ms total (execution 1ms / network 0ms)
root@localhost:26257/ycsb> select ycsb_key from usertable AS OF SYSTEM TIME with_max_staleness('10s') where ycsb_key in ('user5566506082736452842');
          ycsb_key
---------------------------
  user5566506082736452842
(1 row)
Time: 2ms total (execution 2ms / network 0ms)
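The lookups above can also be scripted from the host rather than typed into an interactive session. The following is a minimal sketch: `bounded_read_sql` is a hypothetical helper that builds the same single-key query for a given key and staleness bound, and it assumes both arguments are trusted values, since they are interpolated into the SQL text without escaping.

```shell
#!/bin/sh
# Build a single-key bounded staleness read against ycsb.usertable.
#   $1 = ycsb_key to look up
#   $2 = maximum tolerated staleness (defaults to 10s)
# Assumes trusted input: the values are placed into the SQL as-is.
bounded_read_sql() {
  key=$1
  staleness=${2:-10s}
  printf "SELECT ycsb_key FROM usertable AS OF SYSTEM TIME with_max_staleness('%s') WHERE ycsb_key IN ('%s');\n" \
    "$staleness" "$key"
}

# Usage, running the query non-interactively on the isolated node:
#   docker exec roach-4 /cockroach/cockroach sql --insecure -d ycsb \
#     -e "$(bounded_read_sql user9999414122401537262 10s)"
```

Keeping the query shape identical to the interactive examples matters here, because only single-key lookups are expected to work.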
There are limitations which have to be followed, as per the issue, and it appears single-key reads are the only stage implemented so far. This is a good start toward improving reads to CRDB during network partitions.