cassandra-user mailing list archives

From Gabriel Menegatti <gabr...@s1mbi0se.com.br>
Subject Re: Cassandra DC2 nodes down after increasing write requests on DC1 nodes
Date Sun, 16 Nov 2014 15:25:29 GMT
Hi Eric,

Thanks for your reply.

I said that the load was not a big deal because OpsCenter shows these loads as green, not
yellow or red.

Also, our servers have many processors/threads, so I *think* this load is not problematic.

My assumption is that, for some reason, the 10 DC2 nodes are not able to handle the volume
of requests coming from DC1, which has 30 nodes. Even so, from my point of view, the load on
the DC2 nodes should go really high before Cassandra goes down, but it's not doing so.
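
A rough sketch of the arithmetic behind this assumption, using the topology described in the original message (30 nodes with RF 1 in DC1, 10 nodes with RF 2 in DC2). It assumes writes are spread evenly across the nodes of each data center, and the client write rate is a made-up number for illustration only. Note that ONE or LOCAL_ONE only controls how many acknowledgements the coordinator waits for; the write is still sent to every replica in both data centers.

```python
# Rough per-node replica write load, assuming writes are spread evenly
# across the nodes of each data center (an idealization).

def replica_writes_per_node(client_writes_per_sec, replication_factor, node_count):
    """Each client write is stored on `replication_factor` nodes in the DC."""
    return client_writes_per_sec * replication_factor / node_count

# Hypothetical client write rate, for illustration only.
client_writes_per_sec = 30_000

dc1 = replica_writes_per_node(client_writes_per_sec, replication_factor=1, node_count=30)
dc2 = replica_writes_per_node(client_writes_per_sec, replication_factor=2, node_count=10)

print(f"DC1 per node: {dc1:.0f} writes/s")
print(f"DC2 per node: {dc2:.0f} writes/s")
print(f"DC2/DC1 ratio: {dc2 / dc1:.1f}x")
```

Under these assumptions each DC2 node receives about six times the replica writes of a DC1 node, regardless of the absolute write rate.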

Regards,
Gabriel

Sent from mobile.

> On 16/11/2014, at 12:25, Eric Stevens <mightye@gmail.com> wrote:
> 
> > load average on DC1 nodes are around 3-5 and on DC2 around 7-10
> 
> Anecdotally I can say that loads in the 7-10 range have been dangerously high.  When
> we had a cluster running in this range, the cluster was falling behind on important tasks
> such as compaction, and we really struggled to successfully bootstrap or repair in that DC
> (2.1.1 cluster).
>> On Sun Nov 16 2014 at 6:49:31 AM Gabriel Menegatti <gabriel@s1mbi0se.com.br> wrote:
>> Hello,
>> 
>> We are using Cassandra 2.1.2 in a multi-DC cluster (30 servers in DC1 and 10 in DC2)
>> with a keyspace replication factor of 1 in DC1 and 2 in DC2.
>> 
>> For some reason, when we increase the volume of write requests on DC1 (using ONE or
>> LOCAL_ONE), the Cassandra Java process on DC2 nodes goes down randomly.
>> 
>> At the time the DC2 nodes start to go down, the load average on DC1 nodes is around
>> 3-5 and on DC2 around 7-10, so not a big deal.
>> 
>> Taking a look at Cassandra's system.log, we found some exceptions:
>> 
>> ERROR [SharedPool-Worker-43] 2014-11-15 00:39:48,596 JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting forcefully due to:
>> java.lang.OutOfMemoryError: Java heap space
>> ERROR [CompactionExecutor:8] 2014-11-15 00:39:48,596 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:8,1,main]
>> java.lang.OutOfMemoryError: Java heap space
>> ERROR [Thrift-Selector_2] 2014-11-15 00:39:48,596 Message.java:238 - Got an IOException during write!
>> java.io.IOException: Broken pipe
>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_25]
>>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_25]
>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_25]
>>         at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_25]
>>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470) ~[na:1.8.0_25]
>>         at org.apache.thrift.transport.TNonblockingSocket.write(TNonblockingSocket.java:164) ~[libthrift-0.9.1.jar:0.9.1]
>>         at com.thinkaurelius.thrift.util.mem.Buffer.writeTo(Buffer.java:104) ~[thrift-server-0.3.7.jar:na]
>>         at com.thinkaurelius.thrift.util.mem.FastMemoryOutputTransport.streamTo(FastMemoryOutputTransport.java:112) ~[thrift-server-0.3.7.jar:na]
>>         at com.thinkaurelius.thrift.Message.write(Message.java:222) ~[thrift-server-0.3.7.jar:na]
>>         at com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.handleWrite(TDisruptorServer.java:598) [thrift-server-0.3.7.jar:na]
>>         at com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.processKey(TDisruptorServer.java:569) [thrift-server-0.3.7.jar:na]
>>         at com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.select(TDisruptorServer.java:423) [thrift-server-0.3.7.jar:na]
>>         at com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.run(TDisruptorServer.java:383) [thrift-server-0.3.7.jar:na]
>> ERROR [Thread-94] 2014-11-15 00:39:48,597 CassandraDaemon.java:153 - Exception in thread Thread[Thread-94,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>>         at java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107) ~[na:1.8.0_25]
>>         at org.apache.cassandra.db.composites.AbstractCType.sliceBytes(AbstractCType.java:369) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:101) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:397) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:381) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.composites.AbstractCellNameType$5.deserialize(AbstractCellNameType.java:117) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.composites.AbstractCellNameType$5.deserialize(AbstractCellNameType.java:109) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:106) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:101) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:110) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:322) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:302) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:272) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:168) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:150) ~[apache-cassandra-2.1.2.jar:2.1.2]
>>         at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:82) ~[apache-cassandra-2.1.2.jar:2.1.2]
>> 
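
To see which threads hit the heap limit without reading the whole log by hand, something like the following sketch may help. It assumes the log-line layout shown in the excerpt above (level, thread name in brackets, timestamp); the function name and the regular expression are illustrative, not part of Cassandra, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Matches the start of a log line such as:
# ERROR [CompactionExecutor:8] 2014-11-15 00:39:48,596 CassandraDaemon.java:153 - ...
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
)

def find_oom_threads(lines):
    """Count, per thread name, ERROR entries followed by an OutOfMemoryError line."""
    counts = Counter()
    current_thread = None
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            # A new log entry starts; remember its thread only if it is an ERROR.
            current_thread = m.group("thread") if m.group("level") == "ERROR" else None
        elif current_thread and "java.lang.OutOfMemoryError" in line:
            counts[current_thread] += 1
            current_thread = None  # count each entry at most once
    return counts

# Sample lines taken from the excerpt above (quote prefixes removed).
sample = [
    "ERROR [SharedPool-Worker-43] 2014-11-15 00:39:48,596 JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting forcefully due to:",
    "java.lang.OutOfMemoryError: Java heap space",
    "ERROR [CompactionExecutor:8] 2014-11-15 00:39:48,596 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:8,1,main]",
    "java.lang.OutOfMemoryError: Java heap space",
    "ERROR [Thrift-Selector_2] 2014-11-15 00:39:48,596 Message.java:238 - Got an IOException during write!",
    "java.io.IOException: Broken pipe",
]

oom_threads = find_oom_threads(sample)
for thread, count in sorted(oom_threads.items()):
    print(thread, count)
```

For a real log, the list can be replaced by an open file object, since the function only iterates over lines.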
>> 
>> Memory:
>> - DC1 servers have 32 GB of RAM and the HEAP is configured to 8 GB.
>> - DC2 servers have 16 GB of RAM and the HEAP is also configured to 8 GB.
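
For comparison, here is a sketch of the automatic heap sizing rule as documented in the comments of the stock cassandra-env.sh, max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8 GB)). This is reproduced from memory and should be checked against the cassandra-env.sh shipped with the installed version; the function name is illustrative, and real machines report slightly less memory than the nominal figures used here.

```python
def default_max_heap_mb(system_memory_mb):
    """Heap size (MB) picked by the auto-sizing rule when no heap size is set explicitly."""
    half_capped = min(system_memory_mb // 2, 1024)     # 1/2 of RAM, capped at 1024 MB
    quarter_capped = min(system_memory_mb // 4, 8192)  # 1/4 of RAM, capped at 8192 MB
    return max(half_capped, quarter_capped)

dc1_default = default_max_heap_mb(32 * 1024)  # DC1 servers: 32 GB of RAM
dc2_default = default_max_heap_mb(16 * 1024)  # DC2 servers: 16 GB of RAM

print(f"DC1 auto-sized heap: {dc1_default} MB")
print(f"DC2 auto-sized heap: {dc2_default} MB")
```

Under this rule a 16 GB server would get a 4 GB heap by default, so the 8 GB configured on DC2 takes half of the machine's RAM. Whether that plays any part in the crashes is not established here.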
>> 
>> Any hints, please?
>> 
>> Thanks in advance.
>> 
>> Gabriel.
