cassandra-commits mailing list archives

From "Vijay (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-4189) Improve hints replay
Date Wed, 25 Apr 2012 19:42:17 GMT


Vijay commented on CASSANDRA-4189:

Also, on (mis)completion we flush and force a compaction that should clear out the tombstones
(see CASSANDRA-3733) so I'm skeptical this is a real problem.
Maybe the above will fix it. The hints CF (about 10GB) is too large for the node in question,
so I have to do more tests.

Sure, in a two node cluster maybe the single threaded nature is a problem, but in any cluster
of appreciable size it's always overload that's an issue, so I don't see much to be gained
by multithreading it.
No, the problem is when you have tens of nodes and they are all in different DCs: replay is
naturally throttled by latencies of hundreds of milliseconds. While the thread is stuck
replaying hints to one remote node, no other node gets its hints. What I am suggesting
is to throttle, but in a multithreaded way.
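
A minimal sketch of the "throttle, but multithreaded" idea, in Python for brevity. It is
not Cassandra code: ByteRateLimiter, replay_hints, and the send callback are hypothetical
names, and the shared byte-rate limit is only one of the throttling options mentioned in
this issue. Each endpoint is replayed by its own worker, so a high-latency endpoint does
not hold up the others, while one limiter caps the total bytes replayed per second.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ByteRateLimiter:
    """Hypothetical shared limiter: keeps total throughput near max_bytes_per_sec."""

    def __init__(self, max_bytes_per_sec):
        self.rate = float(max_bytes_per_sec)
        self.lock = threading.Lock()
        self.next_free = time.monotonic()

    def acquire(self, nbytes):
        # Reserve a time slot under the lock, then sleep outside it so
        # other workers can reserve their own slots in the meantime.
        with self.lock:
            now = time.monotonic()
            start = max(now, self.next_free)
            self.next_free = start + nbytes / self.rate
            wait = start - now
        if wait > 0:
            time.sleep(wait)


def replay_hints(hints_by_endpoint, send, limiter, max_threads=4):
    """Replay each endpoint's hints on a separate worker thread.

    hints_by_endpoint maps an endpoint name to a list of serialized hints
    (bytes); send(endpoint, hint) is an assumed delivery callback.
    Returns a dict of endpoint -> number of hints sent.
    """

    def replay_one(endpoint):
        sent = 0
        for hint in hints_by_endpoint[endpoint]:
            limiter.acquire(len(hint))  # throttle on bytes, shared by all workers
            send(endpoint, hint)        # may block for a cross-DC round trip
            sent += 1
        return endpoint, sent

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return dict(pool.map(replay_one, hints_by_endpoint))
```

With a single thread, one endpoint at hundreds of milliseconds per round trip delays every
other endpoint; here only the worker assigned to that endpoint waits.
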
> Improve hints replay
> --------------------
>                 Key: CASSANDRA-4189
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.2
> Problem: Hints are stored in one row.
> when there are a lot of hints stored and we store Tombstones for the ones which have been
> It might be worth sharding the hints based on the hour at which the hints are stored. This
> can reduce the complexity of the scanning for hints.
> Problem: Hints replay is too slow and single threaded.
> There are use cases where the hints need to be replayed ASAP to make the cluster more
> In a multi-region cluster, the throttle is already done due to the latency, which is on
> the order of hundreds of milliseconds.
> It might be worth trying to replay the hints in parallel and throttle on the number of
> bytes read from the disk, or use the existing sleep-interval throttle setting on
> all the threads.
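
The hour-based sharding proposed in the description could look roughly like the sketch
below (Python, for illustration only). The key layout and the names hint_row_key and
row_keys_for_replay are assumptions, not the actual hints schema. The point is that each
target gets one row per hour instead of a single row, so a replay scan only touches the
buckets that can still hold live hints, and fully replayed hours can be dropped whole.

```python
SECONDS_PER_HOUR = 3600


def hint_row_key(target, stored_at_epoch_s):
    """Return a (target, hour_bucket) row key for a hint stored at the given time."""
    return (target, int(stored_at_epoch_s // SECONDS_PER_HOUR))


def row_keys_for_replay(target, oldest_epoch_s, now_epoch_s):
    """List the hourly row keys that may hold hints between two timestamps."""
    first = int(oldest_epoch_s // SECONDS_PER_HOUR)
    last = int(now_epoch_s // SECONDS_PER_HOUR)
    return [(target, hour) for hour in range(first, last + 1)]
```
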


