cassandra-user mailing list archives

From Keith <>
Subject Re: Very large HintsColumnFamily
Date Sat, 22 Dec 2012 01:08:59 GMT

Rob Coli <> wrote:

>Before we start... what version of Cassandra?
>On Fri, Dec 21, 2012 at 4:25 PM, Keith Wright <> wrote:
>> This behavior seems to occur if I do a large
>> amount of data loading using that node as the coordinator node.
>In general you want to use all nodes to coordinate, not a single one.
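
To make the "use all nodes to coordinate" advice concrete, here is a minimal sketch of rotating write batches across every node instead of sending them all through one. The node addresses and the `assign_coordinators` helper are hypothetical names for illustration, not part of any Cassandra client API; many client libraries can do this rotation themselves when given a list of hosts.

```python
from itertools import cycle

# Hypothetical node addresses; substitute the cluster's real hosts.
NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def assign_coordinators(batches, nodes):
    """Pair each write batch with a coordinator, rotating through all nodes."""
    # zip() stops when the batches run out, so cycle() never loops forever.
    return list(zip(cycle(nodes), batches))


if __name__ == "__main__":
    for node, batch in assign_coordinators(["batch-0", "batch-1", "batch-2", "batch-3"], NODES):
        # A real loader would open (or reuse) a connection to `node` here.
        print(f"send {batch} via coordinator {node}")
```

Spreading batches this way means no single node has to buffer hints for the whole load.
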
>> Nodetool netstats never seems to show
>> any streaming data.  With past nodes it seemed like the node eventually
>> fixed itself.
>That node is storing hints for other nodes it believes are or were at
>some point in DOWN state. The first step to preventing this problem
>from recurring is to understand why it believes (or believed) other nodes
>are down. My conjecture is that you are overloading the coordinating node
>and/or other nodes with the large volume of writes.
>> Note that I am seeing severely degraded performance on this node when it
>> attempts to compact the HintsColumnFamily to the point where I had to set
>> setcompactionthroughput to 999 to ensure it doesn't run again (after which
>> the node started serving requests much faster).
>Depending on version, your 40 GB of hints could be in one 40 GB wide
>row. Look at nodetool cfstats for HintsColumnFamily to determine if
>this is the case.
>Do you see "Timed out replaying hint" messages, or are the hints being
>successfully delivered?
>You have two broad options:
>1) purge your hints and then either reload the data (if reloading it
>will be idempotent) or "repair -pr" on every node in the cluster.
>2) reduce load enough that hints will be successfully delivered,
>reduce gc_grace_seconds on the hints cf to 0 and then do a major
>compaction.
>If I were you, I would probably do 1). The easiest way is to stop the
>node and remove all sstables in the HintsColumnFamily.
>=Robert Coli
>YAHOO - rcoli.palominob
>SKYPE - rcoli_palominodb
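
For option 1, the "stop the node and remove all sstables in the HintsColumnFamily" step could be sketched roughly as below. This is only an illustration under assumptions: that the hint SSTable files sit somewhere under the system keyspace data directory and carry `HintsColumnFamily-` in their file names, and that the node is already stopped. The function names and the directory passed in are hypothetical; check the actual layout for your version before deleting anything. The default is a dry run that only prints what would be removed.

```python
from pathlib import Path


def find_hint_sstables(system_keyspace_dir):
    """Return hint SSTable component files under the system keyspace directory.

    Assumption: hint files have "HintsColumnFamily-" in their names.
    """
    root = Path(system_keyspace_dir)
    return sorted(p for p in root.rglob("*HintsColumnFamily-*") if p.is_file())


def purge_hints(system_keyspace_dir, dry_run=True):
    """Delete hint SSTable files; only call this while the node is stopped."""
    files = find_hint_sstables(system_keyspace_dir)
    for path in files:
        if dry_run:
            print(f"would remove {path}")
        else:
            path.unlink()
    return files
```

Call `purge_hints(path)` first to review the list, then `purge_hints(path, dry_run=False)` to delete. After restarting the node, the purged hints still have to be made up for as described above: reload the data if that is idempotent, or run "repair -pr" on every node in the cluster.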