incubator-cassandra-user mailing list archives

From David Boxenhorn <da...@citypath.com>
Subject Re: Replication-aware compaction
Date Tue, 07 Jun 2011 11:47:45 GMT
Thanks! I'm actually on vacation now, so I hope to look into this next week.

On Mon, Jun 6, 2011 at 10:25 PM, aaron morton <aaron@thelastpickle.com> wrote:
> You should consider upgrading to 0.7.6 to get a fix to Gossip. Earlier 0.7 releases were
> prone to marking nodes up and down when they should not have been. See https://github.com/apache/cassandra/blob/cassandra-0.7/CHANGES.txt#L22
>
> Are the TimedOutExceptions on the client for read or write requests? During the burst
> times, which stages are backing up in nodetool tpstats? Compaction should not affect writes
> too much (assuming the commit log and data directories are on separate spindles).
>
> You could also look at the read and write latency stats for a particular CF using
> nodetool cfstats or JConsole; these give you the stats for the local operations. It is
> also worth checking iostat on the box: http://spyced.blogspot.com/2010/01/linux-performance-basics.html
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 7 Jun 2011, at 00:30, David Boxenhorn wrote:
>
>> Version 0.7.3.
>>
>> Yes, I am talking about minor compactions. I have three nodes, RF=3, and
>> 3 GB of data (before replication). Not many users (yet). It seems like 3
>> nodes should be plenty. But when all 3 nodes are compacting, I
>> sometimes get timeouts on the client, and I see in my logs that each
>> one is full of notifications that the other nodes have died (and come
>> back to life after about a second). My cluster can tolerate one node
>> being out of commission, so I would rather have longer compactions one
>> at a time than shorter compactions all at the same time.
>>
>> I think that our usage pattern of bursty writes causes the three nodes
>> to decide to compact at the same time. These bursts are followed by
>> periods of relative quiet, so there should be time for the other two
>> nodes to compact one at a time.
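One way to break the synchronization described above (an editorial sketch, not a Cassandra 0.7 setting) is to give each node a deterministic per-node delay before it starts a minor compaction, so nodes that hit the threshold together during a write burst do not all compact at once:

```python
# Hypothetical mitigation sketch: derive a stable, node-specific offset so
# replicas that reach the compaction threshold simultaneously start at
# different times. The 300-second ceiling is an arbitrary example.
import hashlib

def compaction_delay_seconds(node_id, max_delay=300):
    # Deterministic per-node offset derived from a hash of the node id.
    digest = hashlib.sha1(node_id.encode()).digest()
    return int.from_bytes(digest[:2], "big") % max_delay

for node in ("node1", "node2", "node3"):
    print(node, compaction_delay_seconds(node))
```

The same effect could be approximated operationally by scheduling major compactions one node at a time during the quiet periods.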
>>
>>
>> On Mon, Jun 6, 2011 at 3:27 PM, David Boxenhorn <david@citypath.com> wrote:
>>>
>>> Version 0.7.3.
>>>
>>> Yes, I am talking about minor compactions. I have three nodes, RF=3, and 3 GB of data
>>> (before replication). Not many users (yet). It seems like 3 nodes should be plenty. But when
>>> all 3 nodes are compacting, I sometimes get timeouts on the client, and I see in my logs that
>>> each one is full of notifications that the other nodes have died (and come back to life after
>>> about a second). My cluster can tolerate one node being out of commission, so I would rather
>>> have longer compactions one at a time than shorter compactions all at the same time.
>>>
>>> I think that our usage pattern of bursty writes causes the three nodes to decide
>>> to compact at the same time. These bursts are followed by periods of relative quiet, so there
>>> should be time for the other two nodes to compact one at a time.
>>>
>>>
>>> On Mon, Jun 6, 2011 at 2:36 PM, aaron morton <aaron@thelastpickle.com> wrote:
>>>>
>>>> Are you talking about minor (automatic) compactions? Can you provide some
>>>> more information on what's happening to make the node unusable, and what version you are
>>>> using? It's not a lightweight process, but it should not hurt the node that badly. It is
>>>> considered an online operation.
>>>>
>>>> Delaying compaction will only make it run for longer and take more resources.
>>>>
>>>> Cheers
>>>>
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 6 Jun 2011, at 20:14, David Boxenhorn wrote:
>>>>
>>>>> Is there some deep architectural reason why compaction can't be
>>>>> replication-aware?
>>>>>
>>>>> What I mean is, if one node is doing compaction, its replicas
>>>>> shouldn't be doing compaction at the same time. Or, at least a quorum
>>>>> of nodes should be available at all times.
>>>>>
>>>>> For example, if RF=3, and one node is doing compaction, the nodes to
>>>>> its right and left in the ring should wait on compaction until that
>>>>> node is done.
>>>>>
>>>>> Of course, my real problem is that compaction makes a node pretty much
>>>>> unavailable. If we can fix that problem then this is not necessary.
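The coordination David proposes can be pictured as a cluster-wide advisory lock that a node must hold before compacting, so at most one replica of a range compacts at a time. A toy single-process sketch (hypothetical; nothing like this exists in Cassandra 0.7), using a threading lock as a stand-in for the distributed lock:

```python
# Toy model of "replication-aware" compaction: replicas contend for a shared
# advisory lock, so compactions serialize instead of running concurrently.
# threading.Lock stands in for a hypothetical cluster-wide lock.
import threading

compaction_lock = threading.Lock()
log = []

def compact_when_free(node_name):
    with compaction_lock:        # wait until no other replica is compacting
        log.append(f"{node_name} compacting")

threads = [threading.Thread(target=compact_when_free, args=(n,))
           for n in ("node1", "node2", "node3")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(log)  # all three nodes compact, but strictly one at a time
```

In a real cluster the hard part is the distributed lock itself (failure of the holder, lock scope per token range), which is presumably part of the "deep architectural reason" being asked about.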
>>>>
>>>
>
>
