cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-14812) Multiget Thrift query returns null records after digest mismatch
Date Fri, 07 Dec 2018 14:33:00 GMT


Benedict updated CASSANDRA-14812:
    Summary: Multiget Thrift query returns null records after digest mismatch  (was: Multiget
Thrift query processor skips records in case of digest mismatch)

> Multiget Thrift query returns null records after digest mismatch
> ----------------------------------------------------------------
>                 Key: CASSANDRA-14812
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination, Core
>            Reporter: Sivukhin Nikita
>            Assignee: Benedict
>            Priority: Critical
>              Labels: bug
>             Fix For: 3.0.18
>         Attachments:, requirements.txt,
> It seems that Cassandra 3.0.0 introduced a nasty bug in the {{multiget}} Thrift
query processing logic. When a single {{multiget}} query reads data from several partitions
and a {{DigestMismatch}} exception is raised while that query is processed, the request coordinator
prematurely terminates the response stream at the point where the first {{DigestMismatch}}
error occurs. As a result, clients "do not see" some of the data stored
in the database.
> We managed to reproduce this bug in all versions of Cassandra starting with v3.0.0. The
pre-release version 3.0.0-rc2 works correctly. It looks like the [refactoring of iterator transformation
hierarchy|] related
to CASSANDRA-9975 triggers the incorrect behaviour.
> When the concatenated iterator is returned from [StorageProxy.fetchRows(...)|],
Cassandra starts to consume this combined iterator. Because of the {{DigestMismatch}} exception,
some elements of this combined iterator carry an additional {{ThriftCounter}} that was added
during [DataResolver.resolve(...)|] execution.
While consuming the iterator over many partitions, Cassandra calls the [BaseIterator.tryGetMoreContents(...)|] method,
which must switch from one partition iterator to the next when the former is exhausted.
In this case, all Transformations contained in the next iterator are applied to the combined
BaseIterator that enumerates the partition sequence, which is wrong. This behaviour causes BaseIterator
to stop enumeration after it fully consumes the partition with the {{DigestMismatch}} error, because
this partition iterator has the additional {{ThriftCounter}} data limit.
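> The effect described above can be illustrated with a small toy model. This is only a sketch
of the reported symptom, not Cassandra code: the function {{enumerate_partitions}}, the
{{leak_inner_limit}} flag and the sample data are hypothetical names introduced for illustration,
and the per-partition limit merely stands in for the {{ThriftCounter}} data limit.

```python
def enumerate_partitions(partitions, leak_inner_limit):
    """Yield (key, rows) pairs the way a coordinator might stream them.

    partitions: list of (key, rows, limit). In this toy model, limit is None
    unless the partition went through a digest-mismatch retry (assumption).
    """
    for key, rows, limit in partitions:
        if limit is None:
            yield key, list(rows)
            continue
        # The limit legitimately applies to the rows of this one partition.
        yield key, list(rows[:limit])
        if leak_inner_limit:
            # Bug model: the per-partition limit is also treated as a stop
            # condition for the outer iterator, so later partitions are lost.
            return


# Hypothetical sample data: k2 is the partition that hit the digest mismatch.
PARTITIONS = [
    ("k1", ["a", "b"], None),
    ("k2", ["c", "d"], 2),
    ("k3", ["e", "f"], None),
]

buggy = dict(enumerate_partitions(PARTITIONS, leak_inner_limit=True))
fixed = dict(enumerate_partitions(PARTITIONS, leak_inner_limit=False))

# Keys the client asked for but never received in the buggy case.
missing = sorted(set(fixed) - set(buggy))
print(missing)  # prints ['k3']
```

In this sketch the partition after the mismatching one never reaches the client, which matches
the symptom reported here: enumeration stops once the partition with the extra limit is consumed.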
> The attachment contains the python2 script [^] that reproduces this
bug within a 3-node ccmlib-controlled cluster. There is also an extended version of this
script - [^] - that contains more logging information and provides the ability
to test the behavior for many Cassandra versions (to run all of its test cases you
can call {{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the necessary dependencies
are listed in [^requirements.txt].
> This bug is critical in our production environment because we cannot tolerate any skipped data.
> Any ideas about a patch for this issue?

This message was sent by Atlassian JIRA
