cassandra-commits mailing list archives

From "Xiaolong Jiang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-13115) Read repair is not blocking repair to finish in foreground repair
Date Tue, 10 Jan 2017 23:01:00 GMT
Xiaolong Jiang created CASSANDRA-13115:
------------------------------------------

             Summary: Read repair is not blocking repair to finish in foreground repair
                 Key: CASSANDRA-13115
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13115
             Project: Cassandra
          Issue Type: Bug
         Environment: ccm on OSX 
            Reporter: Xiaolong Jiang


In 3.X, the code that is supposed to wait (block) for the repair results to come back is:
    public void close()
    {
        try
        {
            FBUtilities.waitOnFutures(repairResults, DatabaseDescriptor.getWriteRpcTimeout());
        }
        catch (TimeoutException ex)
        {
            // We got all responses, but timed out while repairing
            int blockFor = consistency.blockFor(keyspace);
            if (Tracing.isTracing())
                Tracing.trace("Timed out while read-repairing after receiving all {} data and digest responses", blockFor);
            else
                logger.debug("Timeout while read-repairing after receiving all {} data and digest responses", blockFor);

            throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
        }
    }

This is in the DataResolver class, but the close method is never called, and nothing closes
it automatically either (RepairMergeListener does not extend AutoCloseable/CloseableIterator).
This means we never wait for the repair to finish before returning the final result.
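
To illustrate the point about auto close, here is a minimal, self-contained sketch of a listener whose close() blocks on outstanding repair futures and is guaranteed to run via try-with-resources. This is not Cassandra code and not a proposed patch: BlockingRepairSketch, RepairListener, RepairTimeoutException and readWithRepair are hypothetical names, and CompletableFuture stands in for the real repair result futures.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BlockingRepairSketch {
    // Hypothetical unchecked stand-in for a read timeout error.
    static final class RepairTimeoutException extends RuntimeException {
        RepairTimeoutException(String message) { super(message); }
    }

    // Hypothetical stand-in for a merge listener that tracks in-flight repairs.
    static final class RepairListener implements AutoCloseable {
        private final List<CompletableFuture<Void>> repairResults;
        private final long timeoutMillis;

        RepairListener(List<CompletableFuture<Void>> repairResults, long timeoutMillis) {
            this.repairResults = repairResults;
            this.timeoutMillis = timeoutMillis;
        }

        // Blocks until every repair has completed, or fails after the timeout.
        @Override
        public void close() {
            try {
                CompletableFuture.allOf(repairResults.toArray(new CompletableFuture<?>[0]))
                                 .get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                throw new RepairTimeoutException("Timed out while read-repairing");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
    }

    // The result only reaches the caller after close() has waited for the repairs.
    static String readWithRepair(List<CompletableFuture<Void>> repairs, long timeoutMillis) {
        try (RepairListener listener = new RepairListener(repairs, timeoutMillis)) {
            return "merged-row"; // placeholder for merging the replica responses
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> repair = CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(50); // simulated repair round trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        String row = readWithRepair(Collections.singletonList(repair), 2000);
        System.out.println(row + ", repair done: " + repair.isDone());
    }
}
```

Because the return statement sits inside the try-with-resources block, close() runs before the caller sees the row, so a slow repair either completes first or surfaces as a timeout.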

Steps to reproduce:
1. create a keyspace/table with RF = 2
2. start 2 nodes using ccm
3. stop node2
4. disable hinted handoff on node1
5. write some data to node1 with consistency level ONE
6. start node2
7. query some data from node1
This should trigger read repair. I added a log statement to the close method above, and it
is never printed.

So this bug basically violates the "monotonic quorum reads" guarantee.
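
As a rough illustration of why the missing wait matters for monotonic quorum reads, here is a toy, single-threaded model. It is not Cassandra code: the three replicas, the version numbers and the MonotonicReadToy class are assumptions made for the example, and repairs are modelled as queued tasks that either run before the read returns (blocking) or stay pending (non-blocking).

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MonotonicReadToy {
    final long[] replicas = new long[3];                      // version held by each replica
    final Deque<Runnable> pendingRepairs = new ArrayDeque<>(); // repairs sent but not yet applied
    final boolean blockOnRepair;

    MonotonicReadToy(boolean blockOnRepair) {
        this.blockOnRepair = blockOnRepair;
    }

    // Reads from two of the three replicas and returns the newest version seen.
    long quorumRead(int a, int b) {
        final long newest = Math.max(replicas[a], replicas[b]);
        for (int r : new int[] {a, b}) {
            if (replicas[r] < newest) {
                final int stale = r;
                pendingRepairs.add(() -> replicas[stale] = Math.max(replicas[stale], newest));
            }
        }
        if (blockOnRepair) {
            // Foreground repair: apply every repair before answering the client.
            while (!pendingRepairs.isEmpty()) {
                pendingRepairs.poll().run();
            }
        }
        return newest;
    }

    // Runs two consecutive quorum reads after a write that reached only replica 0.
    static long[] scenario(boolean blockOnRepair) {
        MonotonicReadToy cluster = new MonotonicReadToy(blockOnRepair);
        cluster.replicas[0] = 1;                // partially applied write, version 1
        long first = cluster.quorumRead(0, 1);  // sees version 1, repairs replica 1
        long second = cluster.quorumRead(1, 2); // must not go back to version 0
        return new long[] {first, second};
    }

    public static void main(String[] args) {
        long[] blocking = scenario(true);
        long[] nonBlocking = scenario(false);
        System.out.println("blocking: " + blocking[0] + " then " + blocking[1]);
        System.out.println("non-blocking: " + nonBlocking[0] + " then " + nonBlocking[1]);
    }
}
```

In this model the blocking variant reads version 1 twice, while the non-blocking variant reads version 1 and then version 0, which is the kind of regression the guarantee is meant to rule out.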




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
