Using YCSB to simulate multi regional workloads with CockroachDB

Andrew Deally
Dec 9, 2021 · 7 min read


Part I

The focus of this series of blogs is to review running the YCSB framework against a single-region database cluster and then explore running it across multiple regions. The latter is a strategy many customers are adopting today to offer a maximally available data service across their enterprise. I will illustrate how easy it is to migrate a YCSB deployment from a single region to multiple regions, without compromising consistency, using CockroachDB.

What is YCSB and why use it…

As per Wikipedia, “YCSB (Yahoo! Cloud Serving Benchmark) is an open-source specification and program suite for evaluating retrieval and maintenance capabilities of computer programs. It is often used to compare relative performance of NoSQL database management systems.” That said, YCSB has also become a widely accepted benchmark tool for relational databases because it represents a set of clients performing simple CRUD operations, which can be configured to various mixes of insert, read, update, scan and read-modify-write operations. The data model and client are simple and can be reconfigured to model query behavior representing workloads across industries, such as session management (cookies, shopping carts, microservices), metadata management (user profiles, IoT devices, password management) and fast lookups (product or financial lookups). Furthermore, the data model can be partitioned across regions/datacenters, which makes YCSB a great tool for illustrating ACID transactions across multiple locations operating on the same rows within a database.
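For reference, the operation mix is driven by a small properties file passed to the client with -P. The stock workloada file that ships with YCSB (the 50/50 read/update mix used later in this post) looks roughly like this; the record and operation counts are only defaults that get overridden on the command line:

# workloads/workloada: CoreWorkload with a 50% read / 50% update mix
workload=site.ycsb.workloads.CoreWorkload
recordcount=1000
operationcount=1000
readallfields=true
readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0
requestdistribution=zipfian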

What is CockroachDB and why use it…

CockroachDB is a distributed relational database which provides an always-on SQL endpoint across multiple SQL nodes, within and across data centers/regions, on both private and public clouds. Take a minute to review the video and design doc.

https://youtu.be/VgXiMcbGwzQ

CockroachDB also provides strong consistency in the face of node and regional failures while maintaining a recovery point objective of zero.

Preparing an environment takes 4 simple steps.

Running a workload requires 2 steps:

  • ycsb load
  • ycsb run

Data can be loaded using 3 methods: a single bulk load with the YCSB client, a parallel load split across several YCSB clients, or the CockroachDB IMPORT INTO command, which generates the same data set as the YCSB client. The commands below create the schema and then use IMPORT INTO to load 10M rows into a single region with 3 nodes.
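For reference, the first two methods use the standard YCSB JDBC client. A minimal sketch, where the host names and batch settings are illustrative and the parallel variant splits the key space with the insertstart/insertcount properties:

# single bulk load of 10M rows
bin/ycsb load jdbc -P workloads/workloada -cp lib/postgresql-42.2.4.jar \
  -p db.driver=org.postgresql.Driver \
  -p db.url="jdbc:postgresql://drew-0001:26257/ycsb?autoReconnect=true&sslmode=disable" \
  -p db.user=root -p db.passwd="" \
  -p recordcount=10000000 -p insertorder=ordered \
  -p jdbc.batchupdateapi=true -p db.batchsize=10

# parallel alternative: loader 2 of 4, covering keys 2500000 through 4999999
bin/ycsb load jdbc -P workloads/workloada -cp lib/postgresql-42.2.4.jar \
  -p db.driver=org.postgresql.Driver \
  -p db.url="jdbc:postgresql://drew-0002:26257/ycsb?autoReconnect=true&sslmode=disable" \
  -p db.user=root -p db.passwd="" \
  -p recordcount=10000000 -p insertorder=ordered \
  -p insertstart=2500000 -p insertcount=2500000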

Create the schema with the following command:

java -cp lib/jdbc-binding-0.17.0.jar:lib/postgresql-42.2.4.jar site.ycsb.db.JdbcDBCreateTable -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://drew-0001:26257/ycsb?autoReconnect=true" -p db.user=root -p db.passwd="" -n usertable

Table DDL:

CREATE TABLE public.usertable (
  ycsb_key VARCHAR NOT NULL,
  field0 STRING NULL, field1 STRING NULL, field2 STRING NULL, field3 STRING NULL, field4 STRING NULL,
  field5 STRING NULL, field6 STRING NULL, field7 STRING NULL, field8 STRING NULL, field9 STRING NULL,
  CONSTRAINT "primary" PRIMARY KEY (ycsb_key ASC),
  FAMILY "primary" (ycsb_key, field0, field1, field2, field3, field4, field5, field6, field7, field8, field9)
)

Load 10MM rows with IMPORT INTO:

IMPORT INTO ycsb.public.usertable (ycsb_key, field0, field1, field2, field3, field4, field5, field6, field7, field8, field9) CSV DATA (
  'workload:///csv/ycsb/usertable?families=true&insert-count=10000000&insert-hash=false&insert-start=0&json=false&max-scan-length=1000&min-scan-length=1&record-count=10000000&request-distribution=zipfian&row-end=3333&row-start=0&scan-length-distribution=uniform&seed=1&select-for-update=true&splits=0&time-string=false&version=1.0.0&zero-padding=1',
  'workload:///csv/ycsb/usertable?families=true&insert-count=10000000&insert-hash=false&insert-start=0&json=false&max-scan-length=1000&min-scan-length=1&record-count=10000000&request-distribution=zipfian&row-end=6666&row-start=3333&scan-length-distribution=uniform&seed=1&select-for-update=true&splits=0&time-string=false&version=1.0.0&zero-padding=1',
  'workload:///csv/ycsb/usertable?families=true&insert-count=10000000&insert-hash=false&insert-start=0&json=false&max-scan-length=1000&min-scan-length=1&record-count=10000000&request-distribution=zipfian&row-end=9999&row-start=6666&scan-length-distribution=uniform&seed=1&select-for-update=true&splits=0&time-string=false&version=1.0.0&zero-padding=1',
  'workload:///csv/ycsb/usertable?families=true&insert-count=10000000&insert-hash=false&insert-start=0&json=false&max-scan-length=1000&min-scan-length=1&record-count=10000000&request-distribution=zipfian&row-end=10000&row-start=9999&scan-length-distribution=uniform&seed=1&select-for-update=true&splits=0&time-string=false&version=1.0.0&zero-padding=1'
) WITH "nullif" = 'NULL';

        job_id       |  status   | fraction_completed |   rows   | index_entries |    bytes
---------------------+-----------+--------------------+----------+---------------+--------------
  717175967322210305 | succeeded |                  1 | 10000000 |             0 | 10418888890
(1 row)
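As an aside, the workload:/// URLs in the IMPORT statement come from CockroachDB's built-in workload tooling, so the same data set can, I believe (verify against your CockroachDB version), be produced with the workload CLI instead of a hand-written IMPORT; the flags mirror the URL query parameters above:

cockroach workload init ycsb \
  --families=true --insert-count=10000000 --insert-hash=false --zero-padding=1 \
  'postgresql://root@drew-0001:26257/ycsb?sslmode=disable'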

Inspecting the keys in the usertable:

select substring(ycsb_key,5,9)::int, ycsb_key from usertable order by 1 limit 10;

  substring | ycsb_key
------------+----------
          0 | user0
          1 | user1
          2 | user2
          3 | user3

select substring(ycsb_key,5,9)::int, ycsb_key from usertable order by 1 limit 10 OFFSET 2000;

  substring | ycsb_key
------------+----------
       2000 | user2000
       2001 | user2001
       2002 | user2002
       2003 | user2003
       2004 | user2004
       2005 | user2005
       2006 | user2006
       2007 | user2007
       2008 | user2008
       2009 | user2009

And lastly, what is the distribution of keys across the key space:

select substring(ycsb_key,1,5), count(1) from usertable group by 1 order by 1;

  substring |  count
------------+----------
  user0     |        1
  user1     |  1111111
  user2     |  1111111
  user3     |  1111111
  user4     |  1111111
  user5     |  1111111
  user6     |  1111111
  user7     |  1111111
  user8     |  1111111
  user9     |  1111111

As this is a distributed database, data is sharded across ranges, otherwise known as partitions (for example, one of the ranges spans the keys user18..user912). Each range has 3 replicas, and in the case of this initial cluster deployment we have 3 nodes, all located in the east.
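Rather than inferring the layout from key prefixes, the range boundaries, leaseholders and replica placement can be read directly from SQL; a quick check (column names vary slightly between CockroachDB versions) looks like this:

SELECT start_key, end_key, lease_holder, replicas
FROM [SHOW RANGES FROM TABLE usertable]
LIMIT 10;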

Now run YCSB workload A (50% reads / 50% updates) against this single region of 3 nodes from a client node in the east. The nodes are located in GCE east1 and have only 2 cores (4 vCPUs) each, so the most throughput we can expect for this workload is under 3000 QPS; larger nodes, or more of them, would deliver higher throughput.

bin/ycsb run jdbc -P workloads/workloada -p db.driver=org.postgresql.Driver -cp lib/postgresql-42.2.4.jar -p db.url="jdbc:postgresql://drew-0001:26257,drew-0002:26257,drew-0003:26257/ycsb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -threads 6 -p requestdistribution=uniform -p recordcount=10000000 -p jdbc.batchupdateapi=true -p db.batchsize=10 -p insertorder=ordered -p operationcount=250000

Using JDBC load balancing across the nodes in the region:

-p db.url="jdbc:postgresql://drew-0001:26257,drew-0002:26257,drew-0003:26257/ycsb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true"

Setting 6 client threads:

-threads 6

The request distribution is uniform and inserts are ordered across the entire key space of 10MM rows, running 250k read and write operations in total:

-p requestdistribution=uniform -p recordcount=10000000 -p insertorder=ordered -p operationcount=250000

The total time to execute was 94 seconds with an average throughput of 2648 operations per second.

[OVERALL], RunTime(ms), 94388
[OVERALL], Throughput(ops/sec), 2648.6417764970124
[READ], Operations, 125003
[READ], AverageLatency(us), 1160.5032599217618
[READ], MinLatency(us), 459
[READ], MaxLatency(us), 60095
[READ], 95thPercentileLatency(us), 1892
[READ], 99thPercentileLatency(us), 3083
[READ], Return=OK, 125003
[UPDATE], Operations, 124997
[UPDATE], AverageLatency(us), 3319.4158339800156

And the database UI confirms the operations per second:

DB Console screenshots: the 3 nodes in east1 servicing ~2700 QPS, the breakdown of read and write transactions, and QPS/TPS with p99 service times.

Now, adding 6 nodes across regions east4 and central1, CockroachDB moves replicas across the regions. As I mentioned earlier, not only is this seamless, it is also online, as the clients are still running in east1.
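Region awareness comes from the locality flags the nodes are started with. I have not shown the start commands in this post, but a node joining from the second region would look roughly like this; the hostnames, join list and --insecure setting are assumptions to keep the sketch short:

# hypothetical start command for one of the new east4 nodes; only --locality matters here
cockroach start \
  --locality=cloud=gce,region=us-east4 \
  --join=drew-0001:26257,drew-0002:26257,drew-0003:26257 \
  --insecure --background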

DB Console screenshots: the second region joins, then the third, and the table becomes fully replicated across all 3 regions over a short period of time.

The replicas moved across the 2 new regions between 13:00 and 13:20, roughly 20 minutes, and the ranges also merged into fewer ranges.
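A quick way to confirm the spread (assuming the nodes carry region localities, as sketched above) is to group the table's ranges by where their replicas landed:

SELECT replica_localities, count(*) AS ranges
FROM [SHOW RANGES FROM TABLE usertable]
GROUP BY replica_localities;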

Now, before getting into running a workload across the 3 regions, let's take a detour and review how CockroachDB maintains consistency and services reads and writes. For each range of data CockroachDB creates 3 or more replicas. Note the range below labeled “Range 0–9”: it is a range of data containing rows with keys “0” through “9”. A range can hold either table or index data. In the illustration below the replication factor is 3, and for the key range “0”–“9” the 3 replicas are stored on nodes 1, 3 and 4; sometimes I refer to them as mirrors. One of these replicas is the leaseholder, which coordinates all reads and writes for the keys in the range. As long as a quorum of these replicas is online, clients are able to read and write the keys, and a quorum of the replicas must agree before any mutation to the rows is acknowledged. If fewer than a quorum of the replicas become unavailable (for example, one replica out of three), the data remains available for reads and writes from the last committed state, thus providing consistent reads after writes.
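You can see which replica currently holds the lease for a particular key directly from SQL; for example, for the row with key user1000 (output columns vary a bit by version):

SHOW RANGE FROM TABLE usertable FOR ROW ('user1000');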

When any read or write arrives in the system, CockroachDB finds the leaseholder for the range in the region/datacenter to perform the operation. The service time for a single-key read may be 1 to 2 ms and a write 2 to 4 ms. Even when the nodes are 10 to 100 ms apart, CockroachDB will keep servicing these operations while incurring some additional latency. I will spend a little more time explaining the various transactional latencies for a few scenarios in the next blog.
