incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Large number of pending gossip stage tasks in nodetool tpstats
Date Thu, 08 Aug 2013 02:22:32 GMT
>  When looking at nodetool
> gossipinfo, I notice that this node has updated to the latest schema hash, but
> that it thinks other nodes in the cluster are on the older version.
What does describe cluster in cassandra-cli say? It will tell you whether there are
multiple schema versions in the cluster.

Can you include the output from nodetool gossipinfo?
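
For a quick look at how many schema versions the node sees, the gossipinfo output can
be summarised with a small shell function. This is only a sketch: the function name
count_schema_versions is made up for illustration, and it assumes each endpoint's schema
is reported on a line containing "SCHEMA:<uuid>", which may differ between versions.

```shell
# Reads `nodetool gossipinfo` output on stdin and prints each distinct
# schema value together with the number of endpoints reporting it.
# Assumption: schema lines contain "SCHEMA:<uuid>"; adjust the pattern if not.
count_schema_versions() {
  grep 'SCHEMA:' | sed 's/.*SCHEMA://' | sort | uniq -c
}

# Usage (nodetool path taken from the original post):
#   /cassandra/bin/nodetool gossipinfo | count_schema_versions
```

More than one line of output would mean this node believes the cluster holds more than
one schema version.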

You may also get some value from increasing the log level for org.apache.cassandra.gms.Gossiper
to DEBUG so you can see what's going on. It's unusual for only the gossip pool to back up.
If there were issues with GC taking CPU we would expect to see it across the board.
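
To confirm that only the gossip pool is backing up, the tpstats output can be filtered
for pools with a non-zero Pending count. A minimal sketch, assuming the column layout
shown in the quoted tpstats output (pool name, Active, Pending, ...); the function name
pools_with_pending is hypothetical.

```shell
# Reads `nodetool tpstats` output on stdin and prints "<pool> <pending>"
# for every thread pool whose Pending column (third field) is above zero.
# The header row and the "Message type / Dropped" section are skipped
# because their second or third field is not a number.
pools_with_pending() {
  awk '$2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $3 > 0 { print $1, $3 }'
}

# Usage (nodetool path taken from the original post):
#   /cassandra/bin/nodetool tpstats | pools_with_pending
```

Running it a few times in a row shows whether the pending count keeps growing and
whether any other pool joins GossipStage in the list.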

Cheers



-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/08/2013, at 7:52 AM, Faraaz Sareshwala <fsareshwala@quantcast.com> wrote:

> I'm running cassandra-1.2.8 in a cluster with 45 nodes across three racks. All
> nodes are well behaved except one. Whenever I start this node, it starts
> churning CPU. Running nodetool tpstats, I notice that the number of pending
> gossip stage tasks is constantly increasing [1]. When looking at nodetool
> gossipinfo, I notice that this node has updated to the latest schema hash, but
> that it thinks other nodes in the cluster are on the older version. I've tried
> to drain, decommission, wipe node data, bootstrap, and repair the node. However,
> the node just started doing the same thing again.
> 
> Has anyone run into this issue before? Can anyone provide any insight into why
> this node is the only one in the cluster having problems? Are there any easy
> fixes?
> 
> Thank you,
> Faraaz
> 
> [1] $ /cassandra/bin/nodetool tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                         0         0              8         0                 0
> RequestResponseStage              0         0          49198         0                 0
> MutationStage                     0         0         224286         0                 0
> ReadRepairStage                   0         0              0         0                 0
> ReplicateOnWriteStage             0         0              0         0                 0
> GossipStage                       1      2213             18         0                 0
> AntiEntropyStage                  0         0              0         0                 0
> MigrationStage                    0         0             72         0                 0
> MemtablePostFlusher               0         0            102         0                 0
> FlushWriter                       0         0             99         0                 0
> MiscStage                         0         0              0         0                 0
> commitlog_archiver                0         0              0         0                 0
> InternalResponseStage             0         0             19         0                 0
> HintedHandoff                     0         0              2         0                 0
> 
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> BINARY                       0
> READ                         0
> MUTATION                     0
> _TRACE                       0
> REQUEST_RESPONSE             0

