cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2034) Make Read Repair unnecessary when Hinted Handoff is enabled
Date Thu, 04 Aug 2011 21:43:28 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079618#comment-13079618 ]

Jonathan Ellis commented on CASSANDRA-2034:
-------------------------------------------

So I proposed two mutually exclusive approaches here:

- return to the client normally after ConsistencyLevel is achieved, but after RpcTimeout
check the responseHandler write acks and write local hints for any missing targets
- add a separate executor here, with a blocking, capped queue. When we go to do a hint-after-failure,
we enqueue the hint write, wait for it to complete, and then return success to the client
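
The two approaches can be sketched roughly as follows. This is an illustrative Python model,
not Cassandra code: missing_targets, hint_after_timeout, BlockingHintWriter and the list used
as a hint store are all hypothetical names, and a real implementation would live on the
mutate path and use the actual responseHandler.

```python
import queue
import threading


def missing_targets(targets, acked):
    """Replicas that had not acknowledged the write when the timeout fired."""
    return [t for t in targets if t not in acked]


# Option 1 (assumed shape): the client has already been answered. A task that
# runs after the timeout writes a local hint for every replica that never acked.
def hint_after_timeout(targets, acked, hint_store):
    for target in missing_targets(targets, acked):
        hint_store.append(target)


# Option 2 (assumed shape): a dedicated hint writer fed by a capped queue.
# The caller blocks until all of its hints are written, and only then would
# success be returned to the client.
class BlockingHintWriter:
    def __init__(self, hint_store, capacity=2):
        self._queue = queue.Queue(maxsize=capacity)  # put() blocks when full
        self._store = hint_store
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            target, done = self._queue.get()
            self._store.append(target)  # stands in for the local hint write
            done.set()

    def write_hints_and_wait(self, targets, acked):
        pending = []
        for target in missing_targets(targets, acked):
            done = threading.Event()
            self._queue.put((target, done))  # back-pressure if the queue is full
            pending.append(done)
        for done in pending:
            done.wait()
```

The capped queue is what gives the second approach its back-pressure: when hint writes fall
behind, callers stall in put() instead of queueing unbounded work.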

The difference can be summarized as: do we wait for all hints to be written before returning
to the client?  If we do, then CL.ONE write latency becomes worst-of-N instead of best-of-N.
But we are guaranteed that successful writes have been hinted (if necessary), so we do
not have to repair unless there is hardware permadeath.  (Otherwise we would have to repair
after power failure or crashes, too.)
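
To make the best-of-N versus worst-of-N point concrete, here is a small Python sketch of the
latency arithmetic. It is a simplified model under stated assumptions: the function names, the
use of None for a replica that never acks, and the sample numbers are all hypothetical, and it
assumes a hint for a missing replica is written only once the timeout has expired.

```python
def cl_one_latency(replica_latencies_ms):
    """Best-of-N: CL.ONE returns as soon as the fastest replica acks."""
    return min(l for l in replica_latencies_ms if l is not None)


def wait_for_hints_latency(replica_latencies_ms, rpc_timeout_ms, hint_write_ms):
    """Worst-of-N: return only once every replica has acked or been hinted."""
    slowest = 0.0
    for latency in replica_latencies_ms:
        if latency is None or latency > rpc_timeout_ms:
            # No ack in time: wait out the timeout, then write the hint.
            slowest = max(slowest, rpc_timeout_ms + hint_write_ms)
        else:
            slowest = max(slowest, latency)
    return slowest


healthy = [2.0, 5.0, 9.0]       # all replicas ack
one_down = [2.0, None, 5.0]     # one replica never acks

print(cl_one_latency(healthy))                          # 2.0
print(wait_for_hints_latency(healthy, 10000.0, 1.0))    # 9.0
print(wait_for_hints_latency(one_down, 10000.0, 1.0))   # 10001.0
```

In this model the healthy case costs only the gap between the fastest and slowest replica,
while a dead replica pushes the wait out to the full timeout, which is why the size of the
write timeout matters for the second approach.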

I'm inclined to think that the second option is better, partly because writes are *fast*, so
worst-of-N really isn't that different from best-of-N.  Also, we could use CASSANDRA-2819
to reduce the default write timeout while still being conservative for reads (which might
hit disk).

> Make Read Repair unnecessary when Hinted Handoff is enabled
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-2034
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Patricio Echague
>             Fix For: 1.0
>
>         Attachments: 2034-formatting.txt, CASSANDRA-2034-trunk-v2.patch, CASSANDRA-2034-trunk-v3.patch,
CASSANDRA-2034-trunk-v4.patch, CASSANDRA-2034-trunk-v5.patch, CASSANDRA-2034-trunk-v6.patch,
CASSANDRA-2034-trunk-v7.patch, CASSANDRA-2034-trunk.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, HH is purely an optimization -- if a machine goes down, enabling HH means
RR/AES will have less work to do, but you can't disable RR entirely in most situations since
HH doesn't kick in until the FailureDetector does.
> Let's add a scheduled task to the mutate path, such that we return to the client normally
after ConsistencyLevel is achieved, but after RpcTimeout we check the responseHandler write
acks and write local hints for any missing targets.
> This would make disabling RR when HH is enabled a much more reasonable option, which
has a huge impact on read throughput.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
