cassandra-user mailing list archives

From Graham Sanderson <gra...@vast.com>
Subject Re: Cassandra Tuning Issue
Date Sun, 06 Dec 2015 16:40:06 GMT
What version of C* are you using, and what JVM version? You showed a partial GC config, but if
that is still CMS (not G1) then you are going to have insane GC pauses...

Depending on your C* version, are you using on-heap or off-heap memtables, and of what type?

Those are the sorts of issues related to fat nodes that I'd be worried about. We run very nicely
at 20G total heap and 8G new; the rest of our 128G memory is disk cache/mmap and all of the
off-heap stuff, so it doesn't go to waste.

That said, I think Jack is probably on the right path with overloaded coordinators, though
you'd still expect to see CPU usage unless your timeouts are too low for the load. In that
case the coordinator would be getting no responses in time, and quite possibly the other nodes
are just dropping the mutations (since they don't get to them before they know the coordinator
would have timed out). I forget the command to check dropped mutations off the top of my
head, but you can see it in OpsCenter.
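
For reference, the dropped-message counters are, as far as I know, printed by `nodetool tpstats` in a "Message type / Dropped" table at the end of its output. Below is a minimal Python sketch for pulling the MUTATION count out of that output. The helper names `dropped_counts` and `dropped_mutations` are made up for illustration, and the assumed row layout (message type followed by a count) may differ between C* versions, so check it against your own output first.

```python
import subprocess


def dropped_counts(tpstats_text):
    """Parse the 'Message type / Dropped' table from `nodetool tpstats` output.

    Assumes each row looks like '<MESSAGE_TYPE> <count> ...'; this layout is
    an assumption and may vary by Cassandra version.
    """
    counts = {}
    in_dropped = False
    for line in tpstats_text.splitlines():
        fields = line.split()
        if fields[:2] == ["Message", "type"]:
            in_dropped = True  # rows after this header are dropped counters
            continue
        if in_dropped and len(fields) >= 2 and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return counts


def dropped_mutations(host="127.0.0.1"):
    """Run nodetool against one node and return its dropped MUTATION count."""
    out = subprocess.run(
        ["nodetool", "-h", host, "tpstats"],
        capture_output=True, text=True, check=True,
    ).stdout
    return dropped_counts(out).get("MUTATION", 0)
```

Running `dropped_mutations()` against each node before and after a load test and comparing the numbers would show whether replicas are shedding writes while the client sees timeouts.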

If you have GC problems you would certainly expect to see GC CPU usage, but depending on how
long you run your tests it might take you a little while to run through 40G.
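
As a rough back-of-the-envelope sketch of that point (in Python), the time before the first young-gen collection is the young gen size divided by the allocation rate. The function name and the `alloc_factor` multiplier are assumptions for illustration; the real allocation rate has to come from GC logs, and this ignores the extra writes each node takes because of replication.

```python
def seconds_to_fill_young_gen(young_gen_gib, writes_per_sec, bytes_per_write, alloc_factor):
    """Rough time until the young generation fills for the first time.

    alloc_factor is a guessed multiplier for how many bytes the JVM allocates
    per payload byte (request objects, memtable copies, and so on). It is an
    assumption, not a measured value.
    """
    alloc_rate = writes_per_sec * bytes_per_write * alloc_factor  # bytes/sec
    return young_gen_gib * 1024**3 / alloc_rate


# Numbers from the thread: 40G new gen, ~80000 writes/s spread over 4 nodes,
# ~130 bytes per row. The factor of 50 is purely an assumption.
per_node_writes = 80000 / 4
print(seconds_to_fill_young_gen(40, per_node_writes, 130, alloc_factor=50))
```

With those assumed inputs the result is on the order of a few minutes, so a short test run could finish before a 40G young gen is ever collected.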

I'm personally not a fan of >32G (ish) heaps, as you can't use compressed oops and such a
heap is also unrealistic for CMS. The word is that G1 is now working OK with C*, especially on
newer C* and JDK versions, but that said it takes quite a lot of throughput to require insane
quantities of young gen. We are guessing that when we remove all our legacy Thrift batch
inserts we will need less, and as for 20G total, we actually don't need that much (we dropped
from 24G when we moved memtables off heap, and believe we can drop further).


> On Dec 6, 2015, at 9:07 AM, Jack Krupansky <jack.krupansky@gmail.com> wrote:
> 
> What replication factor are you using? Even if your writes use CL.ONE, Cassandra will
be attempting writes to the replica nodes in the background.
> 
> Are your writes "token aware"? If not, the receiving node has the overhead of forwarding
the request to the node that owns the token for the primary key.
> 
> For the record, Cassandra is not designed and optimized for so-called "fat nodes". The
design focus is "commodity hardware" and "distributed cluster" (typically a dozen or more
nodes.)
> 
> That said, it would be good if we had a rule of thumb for how many simultaneous requests
a node can handle, both external requests and inter-node traffic. I think there is an open
Jira to enforce a limit on in-flight requests so that nodes don't get overloaded and start failing
in the middle of writes, as you seem to be seeing.
> 
> -- Jack Krupansky
> 
>> On Sun, Dec 6, 2015 at 9:29 AM, jerry <xutom2006@126.com> wrote:
>> Dear All,
>> 
>>     I have a 4-node Cassandra cluster, and I want to find its maximum performance.
I wrote a Java client that batch-inserts data into all 4 Cassandra nodes.
When I start fewer than 30 threads in my client application to insert data into Cassandra,
everything is fine, but when I start more than 80 or 100 threads,
there are too many timeout exceptions (such as: Cassandra timeout during
write query at consistency ONE (1 replica were required but only 0 acknowledged the write)).
No matter how many threads I use, even if I start multiple clients with multiple threads
on different computers, the highest throughput I can get is about 60000 - 80000 TPS. By
the way, each row I insert into Cassandra is about 130 bytes.
>>     Each of my 4 Cassandra nodes has:
>>         CPU: 4*15
>>         Memory: 512G
>>         Disk: flash card (only one disk but better than SSD)
>>     My cassandra configurations are:
>>         MAX_HEAP_SIZE: 60G
>>         NEW_HEAP_SIZE: 40G
>> 
>>     When I insert data into my Cassandra cluster, none of the nodes reaches a bottleneck
in CPU, memory, or disk. Each of these three main hardware resources is idle. So I think maybe
there is something wrong with the configuration of my Cassandra cluster. Can somebody please
help me tune my Cassandra cluster? Thanks in advance!
> 
