cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13983) Support a means of logging all queries as they were invoked
Date Mon, 04 Dec 2017 22:31:00 GMT


Ariel Weisberg commented on CASSANDRA-13983:

Sorry Blake, the Chronicle folks just started publishing their latest artifacts to Maven again.
I upgraded to their latest because they made it sound like there have been a lot of bug fixes
since February, when the version we were using was published. I decided to get it out of
the way now since we avoid updating dependencies in minor releases.

I rebased and squashed and on top of that I have a commit doing the upgrade.

The unit tests all passed, modulo some minor hiccups around changes in behavior in Chronicle.
One test failed because Chronicle now creates a file before you have done any appends. Another
was failing because a change in Chronicle was preventing finalization.

> Support a means of logging all queries as they were invoked
> -----------------------------------------------------------
>                 Key: CASSANDRA-13983
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: CQL, Observability, Testing, Tools
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 4.0
> For correctness testing it's useful to be able to capture production traffic so that
it can be replayed against both the old and new versions of Cassandra while comparing the
> Implementing this functionality once inside the database is high performance and presents
less operational complexity.
> In [this patch|] there is an implementation
of a full query log that uses chronicle-queue (Apache licensed; the Maven artifacts are
labeled incorrectly in some cases, and its dependencies are also Apache licensed) to implement a rotating
log of queries.
> * Single thread asynchronously writes log entries to disk to reduce impact on query latency
> * Heap memory usage bounded by a weighted queue with configurable maximum weight sitting
in front of logging thread
> * If the weighted queue is full producers can be blocked or samples can be dropped
> * Disk utilization is bounded by deleting old log segments once a configurable size is
> * The on disk serialization uses a flexible schema binary format (chronicle-wire) making
it easy to skip unrecognized fields, add new ones, and omit old ones.
> * Can be enabled and configured via JMX, disabled, and reset (delete on disk data), logging
path is configurable via both JMX and YAML
> * Introduce new {{fqltool}} in /bin that currently implements {{Dump}} which can dump
in a human readable format full query logs as well as follow active full query logs
> Follow up work:
> * Introduce new {{fqltool}} command Replay which can replay N full query logs to two
different clusters and compare the result and check for inconsistencies. <- Actively working
on getting this done
> * Log not just queries but their results to facilitate a comparison between the original
query result and the replayed result. <- Really just don't have specific use case at the
> * "Consistent" query logging allowing replay to fully replicate the original order of
execution and completion even in the face of races (including CAS). <- This is more speculative
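
As a rough illustration of the weighted queue described in the feature list above (heap usage bounded by total weight, with producers either blocked or having samples dropped when the queue is full), here is a minimal Java sketch. The class name WeightedQueueSketch, its method names, and the use of a Semaphore for weight accounting are assumptions made for illustration only; this is not the code from the patch.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.function.ToIntFunction;

/** Hypothetical sketch: a queue bounded by total entry weight rather than entry count. */
final class WeightedQueueSketch<T> {
    // Permits represent the remaining weight budget (for example, bytes of heap).
    private final Semaphore availableWeight;
    private final LinkedBlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final ToIntFunction<T> weigher;

    WeightedQueueSketch(int maxWeight, ToIntFunction<T> weigher) {
        this.availableWeight = new Semaphore(maxWeight);
        this.weigher = weigher;
    }

    /** Drop mode: returns false (the sample is dropped) if the weight budget is exhausted. */
    boolean offer(T item) {
        if (!availableWeight.tryAcquire(weigher.applyAsInt(item))) {
            return false;
        }
        queue.add(item);
        return true;
    }

    /**
     * Block mode: waits until enough weight has been freed by the consumer.
     * An item heavier than maxWeight would wait forever; a real implementation
     * would need to reject or cap such items.
     */
    void put(T item) throws InterruptedException {
        availableWeight.acquire(weigher.applyAsInt(item));
        queue.add(item);
    }

    /** For the single logging thread: blocks until an entry is available, then frees its weight. */
    T take() throws InterruptedException {
        T item = queue.take();
        availableWeight.release(weigher.applyAsInt(item));
        return item;
    }

    /** Non-blocking variant of take(); returns null if the queue is empty. */
    T poll() {
        T item = queue.poll();
        if (item != null) {
            availableWeight.release(weigher.applyAsInt(item));
        }
        return item;
    }
}
```

In this sketch a query path would call offer() when dropping samples is acceptable and put() when blocking is preferred, while one background thread loops on take() and writes each entry to disk.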
