Tenant Node Isolation Through Database Partitioning and Sharding
This blog illustrates a method of tenant isolation that uses table partitioning and sharding to restrict a tenant's data to a specific node or set of nodes. In a prior blog, I covered a similar concept across regions; here we will peel back to a single-region cluster. The result is better resource isolation for a tenant or group of tenants, achieved by placing all of their data on a set of nodes separate from other tenants. In addition, the SQL connection to the cluster can be controlled by pointing it at the specific nodes holding a tenant's data. We will use a YCSB workload to generate the tenant data and access patterns; refer to YCSB on CRDB as a primer.
A CockroachDB cluster is set up with 9 nodes across 3 availability zones: us-east1-b, -c, and -d.
Within each AZ, we have 3 nodes, with one assigned to each tenant.
Notice the tenant labels are A, B, and C, where each tenant has a node in each availability zone to ensure high availability for each tenant.
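The tenant label is attached to each node at startup through the `--locality` flag, alongside its region and zone tiers. A minimal sketch for two of the nodes in us-east1-b (hostnames, ports, store paths, and the `--insecure` flag are illustrative assumptions, not the blog's exact commands):

```shell
# Node in us-east1-b assigned to tenant A
cockroach start --insecure \
  --locality=region=us-east1,zone=us-east1-b,tenant=A \
  --store=/mnt/data1 --listen-addr=node1:26257 \
  --join=node1:26257,node2:26257,node3:26257

# Node in us-east1-b assigned to tenant B
cockroach start --insecure \
  --locality=region=us-east1,zone=us-east1-b,tenant=B \
  --store=/mnt/data1 --listen-addr=node2:26257 \
  --join=node1:26257,node2:26257,node3:26257
```

The `tenant=…` tier is what the zone-configuration constraints later match against.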
Now let’s look at the table structure, noting that ycsb_key is partitioned by range values.
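For reference, a partitioned version of the YCSB usertable might look like the following sketch (column list abbreviated; partition names and boundaries are assumptions consistent with the key ranges described below):

```sql
CREATE TABLE usertable (
    ycsb_key VARCHAR(255) PRIMARY KEY,
    field0 TEXT, field1 TEXT, field2 TEXT, field3 TEXT, field4 TEXT,
    field5 TEXT, field6 TEXT, field7 TEXT, field8 TEXT, field9 TEXT
) PARTITION BY RANGE (ycsb_key) (
    PARTITION tenant_a VALUES FROM ('user1') TO ('user5'),  -- user1% .. user4%
    PARTITION tenant_b VALUES FROM ('user5') TO ('user7'),  -- user5% .. user6%
    PARTITION tenant_c VALUES FROM ('user7') TO (MAXVALUE)  -- user7% and above
);
```

Because ycsb_key is the primary key, it can serve directly as the partitioning column.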
Along with the zone configuration settings, ranges of ycsb_key are mapped to partitions, and partitions are mapped to specific nodes based on the node locality settings +tenant=A, +tenant=B, or +tenant=C.
All keys from user1% through user4% are mapped to nodes with locality tenant=A, user5% through user6% to nodes with locality tenant=B, and user7% through user99999 to nodes with locality tenant=C. By mapping users to ycsb_keys, we can steer groups of users and their queries to the nodes assigned to their tenancy. When a user connects to any node in the cluster, their requests will be routed to the proper tenant nodes to read or write data. Additionally, restricting user access by pointing connection strings only at the tenant's nodes further controls resource usage. You can also isolate a specific user or ycsb_key to a dedicated partition.
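The partition-to-node mapping itself is expressed with replication zone constraints that match the node locality labels. A sketch, assuming the partition names above and a database called ycsb:

```sql
ALTER PARTITION tenant_a OF TABLE ycsb.usertable
    CONFIGURE ZONE USING constraints = '[+tenant=A]';

ALTER PARTITION tenant_b OF TABLE ycsb.usertable
    CONFIGURE ZONE USING constraints = '[+tenant=B]';

ALTER PARTITION tenant_c OF TABLE ycsb.usertable
    CONFIGURE ZONE USING constraints = '[+tenant=C]';
```

With three replicas per range and one matching node per availability zone, each tenant's replicas land on exactly that tenant's three nodes.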
Tenant usage and workload isolation
Using a YCSB binary, we can target a specific range of user keys with reads and writes to further illustrate workload isolation. For example, consider the following command:
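A representative invocation using the YCSB JDBC binding might look like the sketch below (not necessarily the blog's exact command; the hostnames, workload file, and everything other than insertstart/insertcount are assumptions):

```shell
# Point the connection string only at tenant A's nodes,
# and restrict the keyspace to the user1% prefix.
bin/ycsb run jdbc -P workloads/workloada \
  -p db.driver=org.postgresql.Driver \
  -p db.url="jdbc:postgresql://node1:26257,node5:26257,node9:26257/ycsb?sslmode=disable" \
  -p db.user=root \
  -p insertorder=ordered \
  -p insertstart=100000 \
  -p insertcount=99999
```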
Notice the connection string points to nodes 1, 5, and 9 [cluster nodes 1, 4, and 7], which are tenant A, where the user1% rows are stored. Using “-p insertstart=100000” and “-p insertcount=99999” specifies the starting key user100000 through the next 99999 sequential keys, all of which fall under the user1% prefix. We can see the workload reaches roughly 2500 reads and 2500 writes to keys across the user1% tenant, and SQL latency appears only on DB nodes 1, 4, and 7.
We can also see that nodes 1, 4, and 7 show elevated CPU utilization,
and write IOPS, with little to no impact on the other nodes in the cluster.
Finally, the “Average Queries per Node” metric, which reflects leaseholder utilization, confirms the workload is serviced by DB nodes 1, 4, and 7 in the cluster.
Now, running users in tenant=B with user5% keys, which are mapped to nodes 2, 5, and 8, we can see that utilization is isolated to those nodes.
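Under the same assumptions as the earlier sketch, the tenant B run only changes the connection string and the key range (the insertstart value here is an assumption chosen so all keys share the user5% prefix):

```shell
# Point the connection string only at tenant B's nodes,
# and restrict the keyspace to the user5% prefix.
bin/ycsb run jdbc -P workloads/workloada \
  -p db.driver=org.postgresql.Driver \
  -p db.url="jdbc:postgresql://node2:26257,node5:26257,node8:26257/ycsb?sslmode=disable" \
  -p db.user=root \
  -p insertorder=ordered \
  -p insertstart=500000 \
  -p insertcount=99999
```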
Wrapping up, we have seen that with partitioning and zone configurations we can isolate reads and writes for a key range (or ranges) to specific nodes, controlling resources for specific users or tenants.