Using YCSB to simulate multi-regional workloads with CockroachDB
Part I
This series of blogs reviews running the YCSB framework against a single-region database cluster and then explores running it across multiple regions. The latter represents a strategy many customers are adopting today to offer a maximally available data service across their enterprise. I will illustrate how easy it is to migrate a YCSB deployment from a single region to multiple regions with CockroachDB, without compromising consistency.
What is YCSB and why use it…
As per Wikipedia, "YCSB (Yahoo! Cloud Serving Benchmark) is an open-source specification and program suite for evaluating retrieval and maintenance capabilities of computer programs. It is often used to compare relative performance of NoSQL database management systems." That said, YCSB has also become a widely accepted benchmark tool for relational databases because it represents a set of clients performing simple CRUD operations that can be configured to various mixes of insert, read, update, scan, and read-modify-write operations. The data model and client are simple and can be reconfigured to support query behavior representative of workloads across industries, such as session management (cookies, shopping carts, microservices), metadata management (user profiles, IoT devices, password management), and fast lookups (product or financial lookups). Furthermore, the data model can be partitioned to illustrate multi-regional workloads across regions/datacenters, which makes YCSB a great tool to demonstrate ACID transactions across multiple locations operating on the same rows within a database.
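For reference, the operation mix lives in a simple properties file passed with -P; the stock workloada that ships with YCSB looks roughly like this (trimmed), and the proportions can be edited to model the scenarios above:
workload=site.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0
requestdistribution=zipfian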
What is CockroachDB and why use it…
CockroachDB is a distributed relational database that provides an always-on SQL endpoint across multiple SQL nodes within and across datacenters/regions, on both private and public clouds. Take a minute to review the video and design doc.
CockroachDB also provides strong consistency in the face of node and regional failures while maintaining a recovery point objective (RPO) of zero.
Preparing an environment takes a few simple steps:
- Create and start a CockroachDB cluster plus a node that will be used as the YCSB client (Create Cluster with Manual or Orchestrated)
- On the client, download the latest YCSB from releases (a download sketch follows this list)
- On the client, download the Postgres JDBC driver, placing it under the YCSB root /lib
- Create the database schema with either the create DDL or the create utility (both shown below)
- Use one of the following connection options: JDBC load balancing, application load balancing, or haproxy
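A minimal sketch of the two download steps above; the exact release tarball name and the Maven URL are assumptions based on the 0.17.0 and 42.2.4 jar versions used later in this post:
# fetch and unpack the YCSB jdbc binding on the client node
curl -LO https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-jdbc-binding-0.17.0.tar.gz
tar xzf ycsb-jdbc-binding-0.17.0.tar.gz && cd ycsb-jdbc-binding-0.17.0
# drop the Postgres JDBC driver under the YCSB root /lib
curl -o lib/postgresql-42.2.4.jar https://repo1.maven.org/maven2/org/postgresql/postgresql/42.2.4/postgresql-42.2.4.jar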
Running a workload requires 2 steps:
- ycsb load
- ycsb run
Data can be loaded using one of 3 methods: a single bulk load, a parallel loading method, or the CockroachDB IMPORT INTO command, which generates the same data set as the YCSB client. Use the following commands to create the schema and load 10M rows into a single region with 3 nodes.
Create the schema with the following command:
java -cp lib/jdbc-binding-0.17.0.jar:lib/postgresql-42.2.4.jar site.ycsb.db.JdbcDBCreateTable -p db.driver=org.postgresql.Driver -p db.url="jdbc:postgresql://drew-0001:26257/ycsb?autoReconnect=true" -p db.user=root -p db.passwd="" -n usertable
Table DDL:
CREATE TABLE public.usertable (
  ycsb_key VARCHAR NOT NULL,
  field0 STRING NULL,
  field1 STRING NULL,
  field2 STRING NULL,
  field3 STRING NULL,
  field4 STRING NULL,
  field5 STRING NULL,
  field6 STRING NULL,
  field7 STRING NULL,
  field8 STRING NULL,
  field9 STRING NULL,
  CONSTRAINT "primary" PRIMARY KEY (ycsb_key ASC),
  FAMILY "primary" (ycsb_key, field0, field1, field2, field3, field4, field5, field6, field7, field8, field9)
);
Load 10M rows with the IMPORT command:
IMPORT INTO ycsb.public.usertable (ycsb_key, field0, field1, field2, field3, field4, field5, field6, field7, field8, field9) CSV DATA (
'workload:///csv/ycsb/usertable?families=true&insert-count=10000000&insert-hash=false&insert-start=0&json=false&max-scan-length=1000&min-scan-length=1&record-count=10000000&request-distribution=zipfian&row-end=3333&row-start=0&scan-length-distribution=uniform&seed=1&select-for-update=true&splits=0&time-string=false&version=1.0.0&zero-padding=1',
'workload:///csv/ycsb/usertable?families=true&insert-count=10000000&insert-hash=false&insert-start=0&json=false&max-scan-length=1000&min-scan-length=1&record-count=10000000&request-distribution=zipfian&row-end=6666&row-start=3333&scan-length-distribution=uniform&seed=1&select-for-update=true&splits=0&time-string=false&version=1.0.0&zero-padding=1',
'workload:///csv/ycsb/usertable?families=true&insert-count=10000000&insert-hash=false&insert-start=0&json=false&max-scan-length=1000&min-scan-length=1&record-count=10000000&request-distribution=zipfian&row-end=9999&row-start=6666&scan-length-distribution=uniform&seed=1&select-for-update=true&splits=0&time-string=false&version=1.0.0&zero-padding=1',
'workload:///csv/ycsb/usertable?families=true&insert-count=10000000&insert-hash=false&insert-start=0&json=false&max-scan-length=1000&min-scan-length=1&record-count=10000000&request-distribution=zipfian&row-end=10000&row-start=9999&scan-length-distribution=uniform&seed=1&select-for-update=true&splits=0&time-string=false&version=1.0.0&zero-padding=1'
) WITH "nullif" = 'NULL';
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+----------+---------------+--------------
717175967322210305 | succeeded | 1 | 10000000 | 0 | 10418888890
(1 row)
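For reference, the equivalent client-side bulk load (the first loading method mentioned above) would look roughly like this; the hostnames, jar version, and batch settings simply mirror the run command shown later in this post:
bin/ycsb load jdbc -P workloads/workloada -p db.driver=org.postgresql.Driver -cp lib/postgresql-42.2.4.jar -p db.url="jdbc:postgresql://drew-0001:26257,drew-0002:26257,drew-0003:26257/ycsb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -p recordcount=10000000 -p insertorder=ordered -p jdbc.batchupdateapi=true -p db.batchsize=10 -threads 6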
Inspecting the key ranges for the usertable:
select substring(ycsb_key,5,9)::int, ycsb_key from usertable order by 1 limit 10;
substring | ycsb_key
0 | user0
1 | user1
2 | user2
3 | user3
select substring(ycsb_key,5,9)::int, ycsb_key from usertable order by 1 limit 10 OFFSET 2000;
2000 | user2000
2001 | user2001
2002 | user2002
2003 | user2003
2004 | user2004
2005 | user2005
2006 | user2006
2007 | user2007
2008 | user2008
2009 | user2009
And lastly, what is the distribution of keys across ranges?
select substring(ycsb_key,1,5), count(1) from usertable group by 1 order by 1;
user0 | 1
user1 | 1111111
user2 | 1111111
user3 | 1111111
user4 | 1111111
user5 | 1111111
user6 | 1111111
user7 | 1111111
user8 | 1111111
user9 | 1111111
As this is a distributed database, data is split across ranges, otherwise known as partitions; one such range spans the keys user18..user912. Each range has 3 replicas, and in the case of this initial cluster deployment we have 3 nodes located in the east region.
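You can inspect the range boundaries, replicas, and leaseholders directly from SQL; the exact statement and output columns vary by CockroachDB version, but something along these lines works:
-- newer versions
SHOW RANGES FROM TABLE usertable;
-- older versions
SHOW EXPERIMENTAL_RANGES FROM TABLE usertable;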
Now run YCSB workload A (50% reads, 50% updates) across this single region of 3 nodes from a local client node in the east. The nodes are located in GCE east1 and have only 2 cores (4 vCPUs) each, so the most throughput we can expect for this workload is under 3000 QPS; larger nodes, or more of them, would yield higher throughput.
bin/ycsb run jdbc -P workloads/workloada -p db.driver=org.postgresql.Driver -cp lib/postgresql-42.2.4.jar -p db.url="jdbc:postgresql://drew-0001:26257,drew-0002:26257,drew-0003:26257/ycsb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true" -p db.user=root -p db.passwd="" -threads 6 -p requestdistribution=uniform -p recordcount=10000000 -p jdbc.batchupdateapi=true -p db.batchsize=10 -p insertorder=ordered -p operationcount=250000
Using JDBC load balancing across the nodes in the region:
-p db.url="jdbc:postgresql://drew-0001:26257,drew-0002:26257,drew-0003:26257/ycsb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true&loadBalanceHosts=true"
Setting 6 client threads:
-threads 6
Where the request distribution is uniform and inserts are ordered across the entire key space of 10M rows, running 250k read and update operations:
-p requestdistribution=uniform -p recordcount=10000000 -p insertorder=ordered -p operationcount=250000
The total time to execute was roughly 94 seconds with an average throughput of 2648 operations per second.
[OVERALL], RunTime(ms), 94388
[OVERALL], Throughput(ops/sec), 2648.6417764970124
[READ], Operations, 125003
[READ], AverageLatency(us), 1160.5032599217618
[READ], MinLatency(us), 459
[READ], MaxLatency(us), 60095
[READ], 95thPercentileLatency(us), 1892
[READ], 99thPercentileLatency(us), 3083
[READ], Return=OK, 125003
[UPDATE], Operations, 124997
[UPDATE], AverageLatency(us), 3319.4158339800156
And the database UI confirms the operations per second.
Now add 6 nodes across two new regions, east4 and central1, and CockroachDB moves replicas across the regions. As I mentioned earlier, not only is this seamless, it is also online, as the clients are still running in east1.
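A minimal sketch of how each new node would be started; the store path and the --insecure flag are assumptions for illustration, while --locality and --join are the standard flags for placing nodes in the new regions:
cockroach start --insecure \
  --locality=region=us-east4 \
  --store=/mnt/data \
  --join=drew-0001:26257,drew-0002:26257,drew-0003:26257 \
  --background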
It took from roughly 13:00 to 13:20 to move the replicas across the 2 new regions, and the ranges have also merged into fewer ranges.
Now, before getting into running a workload across the 3 regions, let's take a detour and review how CockroachDB maintains consistency and services reads and writes. For each range of data, CockroachDB creates 3 or more replicas. Note the range below labeled "Range 0-9": it is a range of data containing rows with keys "0" through "9". These ranges can belong to either a table or an index with its associated set of data. In the illustration below the replication factor is 3 for the key range "0"-"9", and the 3 replicas are stored on nodes 1, 3 and 4; sometimes I refer to them as mirrors. One of these replicas is the leader, which coordinates all reads and writes for the keys in the range. As long as a quorum of these replicas is online, clients are able to read and write the keys, and a quorum of the replicas must agree before any mutation to the rows is acknowledged. If fewer than a quorum of the replicas are unavailable, the data remains available for reads and writes from the last committed state, thus providing a consistent read after write.
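The replication factor and replica placement that drive this quorum behavior are controlled through zone configurations; a minimal sketch, where the region name in the lease preference is illustrative:
-- inspect the current zone configuration for the table
SHOW ZONE CONFIGURATION FOR TABLE usertable;
-- keep 3 replicas per range and prefer leaseholders in the east
ALTER TABLE usertable CONFIGURE ZONE USING num_replicas = 3, lease_preferences = '[[+region=us-east1]]';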
When any reads or writes arrive in the system, CockroachDB will find the leaseholder for the range in the appropriate region/datacenter to perform the operation. The service time for a single-key read may be in the 1 to 2 ms range, and for a write in the 2 to 4 ms range. Even when the nodes are 10 to 100 ms apart, CockroachDB will continue servicing these operations while incurring some higher latency. I will spend a little more time explaining the various transactional latencies for a few scenarios in the next blog.