incubator-cassandra-user mailing list archives

From Ashish Tyagi <tyagi.i...@gmail.com>
Subject Re: High loads only on one node in the cluster
Date Fri, 01 Nov 2013 09:43:56 GMT
Hi Evan,

The clients connect to all nodes. We tried shutting down the Thrift server on
the affected node, but the load did not come down.



On Fri, Nov 1, 2013 at 12:59 AM, Evan Weaver <evan@fauna.org> wrote:

> Are all your clients only connecting to your first node? I would
> probably strace it and compare the trace to one from a lightly loaded
> node.
>
> On Thu, Oct 31, 2013 at 7:12 PM, Ashish Tyagi <tyagi.iitr@gmail.com>
> wrote:
> > We have a 9 node cluster. 6 nodes are in one data-center and 3 nodes in the
> > other. All machines are Amazon M1.XLarge configuration.
> >
> > Datacenter: DC1
> > ==========
> > Address  Rack  Status  State   Load      Owns    Token
> > ip11     1b    Up      Normal  76.46 GB  16.67%  0
> > ip12     1b    Up      Normal  44.66 GB  16.67%  28356863910078205288614550619314017621
> > ip13     1c    Up      Normal  85.94 GB  16.67%  56713727820156410577229101238628035241
> > ip14     1c    Up      Normal  17.55 GB  16.67%  85070591730234615865843651857942052863
> > ip15     1d    Up      Normal  80.74 GB  16.67%  113427455640312821154458202477256070484
> > ip16     1d    Up      Normal  20.88 GB  16.67%  141784319550391026443072753096570088105
> >
> > Datacenter: DC2
> > ==========
> > Address  Rack  Status  State   Load      Owns    Token
> > ip21     1a    Up      Normal  78.32 GB  0.00%   1001
> > ip22     1b    Up      Normal  71.23 GB  0.00%   56713727820156410577229101238628036241
> > ip23     1b    Up      Normal  53.49 GB  0.00%   113427455640312821154458202477256071484
> >
> > The problem is that the node with IP address ip11 often has 5-10 times more
> > load than any other node. Most of the operations are on counters. The primary
> > column family (which receives most writes) has a replication factor of 2 in
> > DataCenter DC1 and also in DataCenter DC2. The traffic is write heavy (reads
> > are less than 10% of total requests). We are using size-tiered compaction.
> > Both writes and reads happen with a consistency level of LOCAL_QUORUM.
> >
> > More information:
> >
> > 1. cassandra.yaml - http://pastebin.com/u344fA6z
> > 2. Jmap heap when node under high loads - http://pastebin.com/ib3D0Pa
> > 3. Nodetool tpstats - http://pastebin.com/s0AS7bGd
> > 4. Cassandra-env.sh - http://pastebin.com/ubp4cGUx
> > 5. GC log lines -  http://pastebin.com/Y0TKphsm
> >
> > Am I doing anything wrong? Any pointers will be appreciated.
> >
> > Thanks in advance,
> > Ashish
>
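
For reference, the DC1 tokens in the quoted ring output look like an even six-way split of the 0 to 2**127 token range. Below is a minimal Python sketch that recomputes that split and the spread of the reported Load values. It assumes RandomPartitioner, since the partitioner is not stated in the message text itself. The names balanced_tokens and load_spread are hypothetical helpers, not part of Cassandra or nodetool.

```python
# Assumption: RandomPartitioner, whose token range is 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count, offset=0):
    """Return node_count evenly spaced tokens, each shifted by offset."""
    step = RING_SIZE // node_count
    return [i * step + offset for i in range(node_count)]


def load_spread(loads_gb):
    """Return the ratio of the largest to the smallest reported load."""
    return max(loads_gb) / min(loads_gb)


# Load values (GB) copied from the DC1 rows of the quoted ring output,
# in the order ip11 .. ip16.
dc1_loads_gb = [76.46, 44.66, 85.94, 17.55, 80.74, 20.88]

dc1_tokens = balanced_tokens(6)

if __name__ == "__main__":
    for token in dc1_tokens:
        print(token)
    print("max/min Load ratio in DC1:", round(load_spread(dc1_loads_gb), 2))
```

With these inputs the computed tokens agree with the quoted DC1 tokens to within 1, so the token assignment itself looks balanced. The largest DC1 Load value is a little under 5 times the smallest, even though every DC1 node owns 16.67% of the ring. The Load column is the data size reported by the ring output, which may not be the same "load" that the original message describes for ip11.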
