CockroachDB Statement Execution Metrics

Andrew Deally
7 min readJul 19, 2022

--

Statement execution statistics are collected to an in-memory table crdb_internal.cluster_statement_statistics which is a collection of the crdb_internal.node_statement_statitsics (in-memory on each node). Every ten minutes , the nodes (consequently the cluster) table is flushed to the system.statement_statistics (persisted to disk). The cluster_statement_statistics is a good source for gathering execution details in real-time with ten minute aggregation stats grouped across those intervals which eventually are aggregated to one hour and then persisted. Using combined statistics, you can select from crdb_internal.statement_statistics and it will return the one hour aggregation from the in-memory (crdb_internal.cluster_statement_statistics) and the persisted (system.statement_statistics). The flush from in-memory to persisted crdb_internal tables is triggered by 3 cluster level settings sql.stats.flush.interval, sql.metrics.max_mem_stmt_fingerprints ,sql.metrics.max_mem_txn_fingerprints ,sql.stats.aggregation.interval and refer to explanation on how this works.

Now let’s take a look at how this works with a simple workload which executes statements taken from my last article on order trade processing. Using the following set of SQL statements

select now(),aggregated_ts, statistics->’statistics’->’cnt’ as cnt, statistics->’statistics’->’svcLat’ as latency ,metadata->’query’ from crdb_internal.cluster_statement_statistics where app_name = ‘trades’ order by aggregated_ts;

select now(),aggregated_ts, statistics->’statistics’->’cnt’ as cnt, statistics->’statistics’->’svcLat’ as latency ,metadata->’query’ from crdb_internal.statement_statistics where app_name = ‘trades’ order by aggregated_ts;

select now(),aggregated_ts, statistics->’statistics’->’cnt’ as cnt, statistics->’statistics’->’svcLat’ as latency ,metadata->’query’ from system.statement_statistics where app_name = ‘trades’ order by aggregated_ts;

We can gather statement execution details to any source for further analysis. Also, We can confirm the behavior of these statistics in the DB console statements page. By sort statements with a filter on application name equal to trades, We see 4 statements with 2 trade execution cycles accounting for a ratio of 3,3,3,6 executions within the past hour. The flow is 1 order received , 1 order submissions, 1 orders acknowledgements and 2 order executions.

The current one hour sample has 20 minutes before the top of the hour, and we can execute more order cycles and monitor the execution statistics. Firing off 12 cycles,

We can see the 12,12,12, 24 executions of statements and their latency

And we can see from the crdb_internal.statement_statistics all the executions in the hour with 15,15,15 and 30 statement executions for 15 trade cycles.

After complete 1000 trade cycles before the top of the hour, we can see the total of the new 1000,1000,1000 and 2000 trade cycles along with the prior executions

With the total of 1015 order received , 1015 order submissions, 1015 orders acknowledgements and 2030 order executions.

Querying the system.statement_statics we see all entries for the 3 different tests.

select now(),aggregated_ts, statistics->’statistics’->’cnt’ as cnt, statistics->’statistics’->’svcLat’ as latency , (metadata->’query’)::string as query ,

(statistics->’statistics’->’svcLat’->’mean’)::float8 * 1000 as latency_ms from

system.statement_statistics where app_name = ‘trades’ and aggregated_ts > ‘2022–07–19 02:59’ order by query;

-[ RECORD 1 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 12

latency | {“mean”: 0.06394161408333333, “sqDiff”: 0.0000029460780948509462}

query | “UPDATE orders_trail SET internal_status = $1, modified_by = $2, modified_time = $3, trans_cdc_id = $4 WHERE (((locality = $5) AND (clientorderid = $6)) OR ((locality = $7) AND (clientorderid = $8))) OR ((locality = $9) AND (clientorderid = $10))”

latency_ms | 63.941614083333334

-[ RECORD 2 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 1000

latency | {“mean”: 0.059567134813, “sqDiff”: 0.11685686182234246}

query | “UPDATE orders_trail SET internal_status = $1, modified_by = $2, modified_time = $3, trans_cdc_id = $4 WHERE (((locality = $5) AND (clientorderid = $6)) OR ((locality = $7) AND (clientorderid = $8))) OR ((locality = $9) AND (clientorderid = $10))”

latency_ms | 59.567134813

-[ RECORD 3 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 3

latency | {“mean”: 0.06332779966666667, “sqDiff”: 1.5247227169066478E-7}

query | “UPDATE orders_trail SET internal_status = $1, modified_by = $2, modified_time = $3, trans_cdc_id = $4 WHERE (((locality = $5) AND (clientorderid = $6)) OR ((locality = $7) AND (clientorderid = $8))) OR ((locality = $9) AND (clientorderid = $10))”

latency_ms | 63.32779966666667

-[ RECORD 4 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 6

latency | {“mean”: 0.04099520033333333, “sqDiff”: 0.000011574360181605335}

query | “WITH batch_cte_0 AS (UPDATE orders SET executed_quantity = executed_quantity + $1, status = $2, modified_by = $3, modified_time = $4, trans_cdc_id = $5 WHERE (locality = $6) AND (orderid = $7) RETURNING _), batch_cte_1 AS (UPDATE orders SET executed_quantity = executed_quantity + $8, status = $9, modified_by = $10, modified_time = $11, trans_cdc_id = $12 WHERE (locality = $13) AND (orderid = $14) RETURNING _), batch_cte_2 AS (UPDATE orders SET executed_quantity = executed_quantity + $15, status = $16, modified_by = $17, modified_time = $18, trans_cdc_id = $19 WHERE (locality = $20) AND (orderid = $21) RETURNING _), cte5 AS (INSERT INTO activity VALUES ($22, $23, __more9__), (__more2__) RETURNING _) SELECT _”

latency_ms | 40.99520033333333

-[ RECORD 5 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 2000

latency | {“mean”: 0.04137242652650007, “sqDiff”: 0.0688685799624998}

query | “WITH batch_cte_0 AS (UPDATE orders SET executed_quantity = executed_quantity + $1, status = $2, modified_by = $3, modified_time = $4, trans_cdc_id = $5 WHERE (locality = $6) AND (orderid = $7) RETURNING _), batch_cte_1 AS (UPDATE orders SET executed_quantity = executed_quantity + $8, status = $9, modified_by = $10, modified_time = $11, trans_cdc_id = $12 WHERE (locality = $13) AND (orderid = $14) RETURNING _), batch_cte_2 AS (UPDATE orders SET executed_quantity = executed_quantity + $15, status = $16, modified_by = $17, modified_time = $18, trans_cdc_id = $19 WHERE (locality = $20) AND (orderid = $21) RETURNING _), cte5 AS (INSERT INTO activity VALUES ($22, $23, __more9__), (__more2__) RETURNING _) SELECT _”

latency_ms | 41.37242652650007

-[ RECORD 6 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 24

latency | {“mean”: 0.04061466725, “sqDiff”: 0.000028095182481034526}

query | “WITH batch_cte_0 AS (UPDATE orders SET executed_quantity = executed_quantity + $1, status = $2, modified_by = $3, modified_time = $4, trans_cdc_id = $5 WHERE (locality = $6) AND (orderid = $7) RETURNING _), batch_cte_1 AS (UPDATE orders SET executed_quantity = executed_quantity + $8, status = $9, modified_by = $10, modified_time = $11, trans_cdc_id = $12 WHERE (locality = $13) AND (orderid = $14) RETURNING _), batch_cte_2 AS (UPDATE orders SET executed_quantity = executed_quantity + $15, status = $16, modified_by = $17, modified_time = $18, trans_cdc_id = $19 WHERE (locality = $20) AND (orderid = $21) RETURNING _), cte5 AS (INSERT INTO activity VALUES ($22, $23, __more9__), (__more2__) RETURNING _) SELECT _”

latency_ms | 40.61466725

-[ RECORD 7 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 12

latency | {“mean”: 0.0032308703333333334, “sqDiff”: 0.0000025438111555886663}

query | “WITH cte2 AS (INSERT INTO orders_trail VALUES ($1, $2, __more10__), (__more2__) RETURNING _), cte1 AS (INSERT INTO orders VALUES ($37, $38, __more10__), (__more2__) RETURNING _) SELECT _”

latency_ms | 3.2308703333333333

-[ RECORD 8 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 3

latency | {“mean”: 0.003913755, “sqDiff”: 0.0000011188094042579999}

query | “WITH cte2 AS (INSERT INTO orders_trail VALUES ($1, $2, __more10__), (__more2__) RETURNING _), cte1 AS (INSERT INTO orders VALUES ($37, $38, __more10__), (__more2__) RETURNING _) SELECT _”

latency_ms | 3.9137549999999997

-[ RECORD 9 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 1000

latency | {“mean”: 0.0035162634279999987, “sqDiff”: 0.00021040338375521073}

query | “WITH cte2 AS (INSERT INTO orders_trail VALUES ($1, $2, __more10__), (__more2__) RETURNING _), cte1 AS (INSERT INTO orders VALUES ($37, $38, __more10__), (__more2__) RETURNING _) SELECT _”

latency_ms | 3.5162634279999985

-[ RECORD 10 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 3

latency | {“mean”: 0.004645276, “sqDiff”: 3.965867442419999E-7}

query | “WITH cte3 AS (UPDATE orders SET status = $1, modified_by = $2, modified_time = $3, trans_cdc_id = $4 WHERE (((locality = $5) AND (orderid = $6)) OR ((locality = $7) AND (orderid = $8))) OR ((locality = $9) AND (orderid = $10)) RETURNING _), cte4 AS (INSERT INTO activity VALUES ($11, $12, __more9__), (__more2__) RETURNING _) SELECT _”

latency_ms | 4.645276

-[ RECORD 11 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 1000

latency | {“mean”: 0.004550776761999993, “sqDiff”: 0.00010754873547948524}

query | “WITH cte3 AS (UPDATE orders SET status = $1, modified_by = $2, modified_time = $3, trans_cdc_id = $4 WHERE (((locality = $5) AND (orderid = $6)) OR ((locality = $7) AND (orderid = $8))) OR ((locality = $9) AND (orderid = $10)) RETURNING _), cte4 AS (INSERT INTO activity VALUES ($11, $12, __more9__), (__more2__) RETURNING _) SELECT _”

latency_ms | 4.550776761999993

-[ RECORD 12 ]

now | 2022–07–19 04:18:32.716553+00

aggregated_ts | 2022–07–19 03:00:00+00

cnt | 12

latency | {“mean”: 0.004366677916666667, “sqDiff”: 7.446070380089179E-7}

query | “WITH cte3 AS (UPDATE orders SET status = $1, modified_by = $2, modified_time = $3, trans_cdc_id = $4 WHERE (((locality = $5) AND (orderid = $6)) OR ((locality = $7) AND (orderid = $8))) OR ((locality = $9) AND (orderid = $10)) RETURNING _), cte4 AS (INSERT INTO activity VALUES ($11, $12, __more9__), (__more2__) RETURNING _) SELECT _”

latency_ms | 4.366677916666667

And its confirmed in the UI for the one hour window

Next, crdb_internal and system transactions tables.

--

--

No responses yet