cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2034) Make Read Repair unnecessary when Hinted Handoff is enabled
Date Fri, 12 Aug 2011 14:33:27 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084154#comment-13084154
] 

Jonathan Ellis commented on CASSANDRA-2034:
-------------------------------------------

- there's an unused overload of MS.addCallback
- shouldHint should probably be a CallbackInfo method
- the .warn in scheduleMH should be .error
- any reason scheduleMH is protected instead of private?
- technically MS.shutdown should probably collect the Futures of the hints being written and
wait on them 

in
{code}
if (consistencyLevel != null && consistencyLevel == ConsistencyLevel.ANY)
{code}

did you mean this?
{code}
if (responseHandler != null && consistencyLevel == ConsistencyLevel.ANY)
{code}

- not a big fan of the "subclass just exposes a different constructor than its parent" idiom.
 I think the normal expectation is that a subclass encapsulates some kind of different behavior
than its parent.  I'd get rid of CIWM and just expose a with-Message constructor in CallbackInfo.
- Would also make CI message and isMutation final
- avoid declaring @Override on abstract methods (looking at writeLocalHint Runnable, also
CTAF, may be others)
- avoid comments that repeat what the code says, like "// One more task in the hints queue."
- would name hintCounter -> totalHints (I had to look at usages to figure out what it does)
- there's no actual hint queue anymore so would name currentHintsQueueSize -> hintsInProgress
(similarly, maxHintsQueueSize)
- prefer camelCase variable names (CTAF overall_timeout)
- unit.toMillis(timeout) would be more idiomatic than TimeUnit.MILLISECONDS.convert(timeout,
unit)
- otherwise CTAF is a good clean encapsulation, nice job
- generics is upset about "hintsFutures.add(new CreationTimeAwareFuture(hintfuture))" -- can
you fix the unchecked warning there?
- unnecessary whitespace added to the method below "// wait for writes.  throws TimeoutException
if necessary"
- would prefer to avoid allocating the hintFutures list unless we actually need to write hints,
since this is on the write inner loop
- still think we can simplify sendToHinted by getting rid of getHintedEndpoints and operating
directly on the raw writeEndpoints (and consulting FD to decide whether to write a hint)
- currentHintsQueueSize increment needs to be done OUTSIDE the runnable or it will never get
above the number of task executors
- it would be nice if we could tell the CallbackInfo whether the original write reached ConsistencyLevel.
 Maybe we could do this by changing isMutation to volatile isHintRequired, something like
that.  (And just set it to false if CL is not reached.)  If not, we don't have to write hints
for timed out replicas -- this will help avoiding OOM if nodes in the cluster start getting
overloaded and dropping writes.  Otherwise, the coordinator could run out of memory queuing
up hints for writes that timed out and the client will retry.

> Make Read Repair unnecessary when Hinted Handoff is enabled
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-2034
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Patricio Echague
>             Fix For: 1.0
>
>         Attachments: 2034-formatting.txt, CASSANDRA-2034-trunk-v10.patch, CASSANDRA-2034-trunk-v11.patch,
CASSANDRA-2034-trunk-v11.patch, CASSANDRA-2034-trunk-v2.patch, CASSANDRA-2034-trunk-v3.patch,
CASSANDRA-2034-trunk-v4.patch, CASSANDRA-2034-trunk-v5.patch, CASSANDRA-2034-trunk-v6.patch,
CASSANDRA-2034-trunk-v7.patch, CASSANDRA-2034-trunk-v8.patch, CASSANDRA-2034-trunk-v9.patch,
CASSANDRA-2034-trunk.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, HH is purely an optimization -- if a machine goes down, enabling HH means
RR/AES will have less work to do, but you can't disable RR entirely in most situations since
HH doesn't kick in until the FailureDetector does.
> Let's add a scheduled task to the mutate path, such that we return to the client normally
after ConsistencyLevel is achieved, but after RpcTimeout we check the responseHandler write
acks and write local hints for any missing targets.
> This would making disabling RR when HH is enabled a much more reasonable option, which
has a huge impact on read throughput.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message