How to track Contention with CockroachDB

Andrew Deally
3 min readJul 6, 2022

--

In a database, Contention occurs when multiple processes are trying to access the same keys at the same time or relatively close in time. Since CockroachDB operates at serializable isolation such conflicts will occur and are detailed under transactions and one of the transactions will wait in a queue. When this occurs, the transaction latency will by impacted but the duration of the contention wait event. In Cockroach V22, contention tables were introduced to allow admininstrators to track contention events and the offending SQL statements/transactions. This article demonstrates how to track contention using the new crdb_internal tables.

YCSB has a method to create contention this is the command

bin/ycsb run jdbc -P workloads/workloada -p db.driver=org.postgresql.Driver -cp lib/postgresql-42.2.4.jar -p db.url=”jdbc:postgresql://127.0.0.1:26257/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true” -p db.user=root -p db.passwd=”” -p requestdistribution=uniform -p recordcount=10000 -p jdbc.batchupdateapi=true -p db.batchsize=10 -p insertorder=ordered -p insertstart=2400 -p insertcount=4 -p operationcount=100000 -threads 4

I will run 4 threads which select and update key ranges 2400–2403 creating contention and CRDB will block and this demo is great as its short lived implicit transactions

Nothing is running and I check the new contention tables

SELECT collection_ts , contention_duration , sw.app_name as waiting_app_name , sb.app_name as blocking_app_name , waiting_txn_fingerprint_id , blocking_txn_fingerprint_id , t.name as table_name , i.index_name , key FROM ( SELECT collection_ts, contention_duration, waiting_txn_fingerprint_id, blocking_txn_fingerprint_id, key_parts[2]::INT AS table_id, key_parts[3]::INT AS index_id, key FROM ( SELECT collection_ts, contention_duration, waiting_txn_fingerprint_id, blocking_txn_fingerprint_id, regexp_split_to_array(crdb_internal.pretty_key(contending_key, 0), ‘/’) AS key_parts, crdb_internal.pretty_key(contending_key, 0) as key FROM crdb_internal.transaction_contention_events)) e JOIN crdb_internal.table_indexes i ON e.index_id = i.index_id AND e.table_id = i.descriptor_id JOIN crdb_internal.tables t ON e.table_id = t.table_id LEFT JOIN crdb_internal.cluster_transaction_statistics sw ON e.waiting_txn_fingerprint_id = sw.fingerprint_id LEFT JOIN crdb_internal.cluster_transaction_statistics sb ON e.blocking_txn_fingerprint_id = sb.fingerprint_id WHERE t.database_name = current_database() AND collection_ts >= NOW() — INTERVAL ‘1 MINUTES’

collection_ts | contention_duration | waiting_app_name | blocking_app_name | waiting_txn_fingerprint_id | blocking_txn_fingerprint_id | table_name | index_name | key

— — — — — — — — + — — — — — — — — — — -+ — — — — — — — — — + — — — — — — — — — -+ — — — — — — — — — — — — — — + — — — — — — — — — — — — — — -+ — — — — — — + — — — — — — + — — —

(0 rows)

Time: 28ms total (execution 27ms / network 0ms)

kicking off contention workload

bin/ycsb run jdbc -P workloads/workloada -p db.driver=org.postgresql.Driver -cp lib/postgresql-42.2.4.jar -p db.url=”jdbc:postgresql://127.0.0.1:26257/defaultdb?autoReconnect=true&sslmode=disable&ssl=false&reWriteBatchedInserts=true” -p db.user=root -p db.passwd=”” -p requestdistribution=uniform -p recordcount=10000 -p jdbc.batchupdateapi=true -p db.batchsize=10 -p insertorder=ordered -p insertstart=2400 -p insertcount=4 -p operationcount=1000000 -threads 4Contention show on statements page

starting with user2400 to user2403, We can see contention in the DB Console

with the following output of the above SQL contention

and

we can see the keys which display contetion user2400–2403

the following view can be used to display the statements

CREATE VIEW public.blocking ( blocking_statement, waiting_statement, contention_count) AS SELECT DISTINCT hce.blocking_statement, substring(ss2.metadata->>’query’, 1, 60) AS waiting_statement, hce.contention_count FROM ( SELECT blocking_txn_fingerprint_id, waiting_txn_fingerprint_id, contention_count, substring(ss.metadata->>’query’, 1, 60) AS blocking_statement FROM ( SELECT encode(blocking_txn_fingerprint_id, ‘hex’) AS blocking_txn_fingerprint_id, encode(waiting_txn_fingerprint_id, ‘hex’) AS waiting_txn_fingerprint_id, count(*) AS contention_count FROM defaultdb.crdb_internal.transaction_contention_events GROUP BY blocking_txn_fingerprint_id, waiting_txn_fingerprint_id), defaultdb.crdb_internal.statement_statistics AS ss WHERE blocking_txn_fingerprint_id = encode(ss.transaction_fingerprint_id, ‘hex’)) AS hce, defaultdb.crdb_internal.statement_statistics AS ss2 WHERE ( hce.blocking_txn_fingerprint_id != ‘0000000000000000’ AND hce.waiting_txn_fingerprint_id != ‘0000000000000000’) AND hce.waiting_txn_fingerprint_id = encode(ss2.transaction_fingerprint_id, ‘hex’) ORDER BY contention_count DESC

there a great article on why understanding contention is important.

--

--