cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Tuning cassandra (compactions overall)
Date Tue, 22 May 2012 10:53:53 GMT
I'm not sure what you mean by:
> And after restarting the second one I have lost all the consistency of
> my data. All my statistics since September are totally false now in
> production

Can you give some examples?
Counters are not idempotent, so if the client app retries TimedOut requests you can get an
overcount. That should not result in lost data.
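
To illustrate the overcount, here is a minimal sketch in plain Python. It does not talk to Cassandra; FlakyCounter, TimedOut and increment_with_retry are hypothetical names made up for this example. It assumes the increment was applied on the server but the acknowledgement was lost, so the client saw a timeout and resent it.

```python
class TimedOut(Exception):
    """Raised when the client does not receive an acknowledgement in time."""


class FlakyCounter:
    """Toy counter: the first increment is applied but reported as timed out."""

    def __init__(self):
        self.value = 0
        self._fail_next_ack = True

    def increment(self, delta=1):
        # The write is applied before the acknowledgement is (possibly) lost.
        self.value += delta
        if self._fail_next_ack:
            self._fail_next_ack = False
            raise TimedOut("ack lost, but the increment was applied")


def increment_with_retry(counter, delta=1, attempts=3):
    """Naive client retry: resend the increment whenever it times out."""
    for _ in range(attempts):
        try:
            counter.increment(delta)
            return
        except TimedOut:
            continue
    raise TimedOut("gave up after %d attempts" % attempts)


counter = FlakyCounter()
increment_with_retry(counter)
print(counter.value)  # 2: the client asked for +1, the counter moved by 2
```

Note that the error only ever goes upwards in this scenario: a retried increment can be counted twice, but nothing is removed.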

> As reminder I'm using a 2 node cluster RF=2, CL.ONE

Have you been running repair?
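
The reason I ask: with RF=2 and CL.ONE, a read is not guaranteed to hit a replica that received the latest write. A rough sketch of the usual "R + W > RF" rule follows, assuming both reads and writes use CL.ONE (you only said CL.ONE, so that part is my assumption). The function name is made up for illustration, and counters have extra subtleties this does not model.

```python
def replicas_overlap(rf, write_replicas, read_replicas):
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


RF = 2
ONE = 1
QUORUM = RF // 2 + 1  # 2 when RF=2, i.e. both nodes

print(replicas_overlap(RF, ONE, ONE))        # False: a read may miss the latest write
print(replicas_overlap(RF, QUORUM, QUORUM))  # True: read and write sets always overlap
```

When the sets do not overlap, the replicas only converge through hinted handoff, read repair or nodetool repair.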

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/05/2012, at 1:27 AM, Alain RODRIGUEZ wrote:

> Hi Aaron.
> 
> I wanted to try the new config. After doing a rolling restart I have
> all my counters false, with wrong values. I stopped my servers with
> the following :
> 
> nodetool -h localhost disablegossip
> nodetool -h localhost disablethrift
> nodetool -h localhost drain
> kill cassandra sigterm (15) via htop
> 
> And after restarting the second one I have lost all the consistency of
> my data. All my statistics since September are totally false now in
> production.
> 
> As reminder I'm using a 2 node cluster RF=2, CL.ONE
> 
> 1 - How to fix it ? (I have a backup from this morning, but I will
> lose all the data after this date if I restore this backup)
> 2 - What happened ? How to avoid it ?
> 
> Any idea would be greatly appreciated, I'm quite desperate.
> 
> Alain
> 
> 2012/5/17 aaron morton <aaron@thelastpickle.com>:
>>> What is the benefit of having more memory ? I mean, I don't
>>> understand why having 1, 2, 4, 8 or 16 GB of memory is so different.
>> 
>> Less frequent and less aggressive garbage collection frees up CPU resources
>> to run the database.
>> 
>> Less memory results in frequent and aggressive (i.e. stop-the-world) GC and
>> increased IO pressure, which reduces read performance and in the extreme can
>> block writes.
>> 
>>> The memory used inside
>>> the heap will remain close to the max memory available, therefore
>>> having more or less memory doesn't matter.
>> 
>> That is not an ideal situation. It becomes difficult to find a contiguous
>> region of memory to allocate.
>> 
>>> Can you enlighten me about this point ?
>> 
>> It's a database server; it is going to work better with more memory. Also,
>> it's Java, and it's designed to run on multiple machines with many GBs of
>> RAM available. There are better arguments
>> here: http://wiki.apache.org/cassandra/CassandraHardware
>> 
>> 
>>> I'm interested a lot in learning about some configuration I can use to
>>> reach better performance/stability as well as in learning about how
>>> Cassandra works.
>> 
>> Turn off all caches.
>> 
>> In the schema, increase the bloom filter false positive rate (see the help
>> in the CLI for "create column family").
>> 
>> In the yaml experiment with these changes:
>> * reduce sliced_buffer_size_in_kb
>> * reduce column_index_size_in_kb
>> * reduce in_memory_compaction_limit_in_mb
>> * increase index_interval
>> * set concurrent_compactors to 2
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 17/05/2012, at 12:40 AM, Alain RODRIGUEZ wrote:
>> 
>> Using c1.medium, we are currently able to deliver the service.
>> 
>> What is the benefit of having more memory ? I mean, I don't
>> understand why having 1, 2, 4, 8 or 16 GB of memory is so different.
>> In my mind, Cassandra will fill the heap and from then, start to flush
>> and compact to avoid OOMing and fill it again. The memory used inside
>> the heap will remain close to the max memory available, therefore
>> having more or less memory doesn't matter.
>> 
>> I'm pretty sure I misunderstand or forget something about how the
>> memory is used but not sure about what.
>> 
>> Can you enlighten me about this point ?
>> 
>> If I understand why the memory size is that important I will probably
>> be able to argue about the importance of having more memory and my
>> boss will probably allow me to spend more money to get better servers.
>> 
>> "There are some changes you can make to mitigate things (let me know
>> if you need help), but this is essentially a memory problem."
>> 
>> I'm interested a lot in learning about some configuration I can use to
>> reach better performance/stability as well as in learning about how
>> Cassandra works.
>> 
>> Thanks for the help you give to people and for sharing your knowledge
>> with us. I appreciate a lot the Cassandra community and the most
>> active people keeping it alive. It's worth being said :).
>> 
>> Alain
>> 
>> 

