incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Cassandra lockup (0.6.5) on bulk write
Date Wed, 06 Oct 2010 00:57:07 GMT
Some quick answers...

Cloudkick monitoring may be helpful https://www.cloudkick.com/monitoring-features

rpc_timeout_in_ms is the timeout for communication between nodes in the cluster, not between
clients and the cluster. 

AFAIK you should be setting timeouts on the Thrift socket. If you get a timeout then retry
on another node. Email the pelops peeps and ask them if there was a reason for not setting
one. 

Not sure about setting keep alive. 
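For illustration, here is a minimal sketch of what setting a read timeout alongside keep-alive looks like on a plain java.net.Socket (the one Thrift's TSocket wraps and exposes via getSocket()). The 10 second value is just an assumption mirroring the server-side RpcTimeoutInMillis, not anything Pelops ships with:

```java
import java.net.Socket;
import java.net.SocketException;

public class ClientSocketConfig {

    // Sketch: configure a client socket with a read timeout in addition to
    // the keep-alive Pelops already sets. A read that blocks longer than
    // the timeout throws SocketTimeoutException instead of hanging forever,
    // so the client can retry against another node.
    public static Socket configure(Socket socket) throws SocketException {
        socket.setSoTimeout(10000); // assumed value, mirrors RpcTimeoutInMillis
        socket.setKeepAlive(true);  // what Pelops already does
        return socket;
    }

    public static void main(String[] args) throws Exception {
        // Unconnected socket: options can be set before connecting.
        Socket s = configure(new Socket());
        System.out.println(s.getSoTimeout() + " " + s.getKeepAlive());
    }
}
```

With a timeout in place, a hung server surfaces as a catchable exception on the client rather than an indefinite block in socketRead0.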

Your tokens are different from the ones calculated by the python function in the Load Balancing
section here http://wiki.apache.org/cassandra/Operations . You may want to take another look
at them, I get 
tokens(5)
34028236692093846346337460743176821145
68056473384187692692674921486353642290
102084710076281539039012382229530463436
136112946768375385385349842972707284581
170141183460469231731687303715884105727
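As a sketch, the five tokens above can be reproduced in Java with BigInteger; note they correspond to i * (2^127 - 1) / n for i = 1..n, which differs by a rounding hair from the wiki's python function (that one uses 2^127):

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class Tokens {

    // Sketch: evenly spaced RandomPartitioner tokens for n nodes,
    // spanning the ring top of 2^127 - 1.
    public static List<BigInteger> tokens(int n) {
        BigInteger top = BigInteger.valueOf(2).pow(127).subtract(BigInteger.ONE);
        BigInteger nodes = BigInteger.valueOf(n);
        List<BigInteger> out = new ArrayList<BigInteger>();
        for (int i = 1; i <= n; i++) {
            // floor(i * top / n)
            out.add(top.multiply(BigInteger.valueOf(i)).divide(nodes));
        }
        return out;
    }

    public static void main(String[] args) {
        for (BigInteger t : tokens(5)) {
            System.out.println(t);
        }
    }
}
```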

I'm not sure you are doing yourself any favors writing at CL ONE. It means your clients will
get an ack as soon as one node has written its data, and you're leaving the rest of the nodes
to sort themselves out. Which *may not happen* if you are overloading the cluster. The overloaded
nodes could drop the messages or the messages could time out, which is fine because you've told
the cluster you're happy with low consistency.

So after the load you will want to run repair to make sure the data is correctly replicated. I
would suggest writing at CL.QUORUM or higher, and still running the repair afterwards, when
hopefully it will have little to do. 
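For illustration, here is how the quorum size falls out of the replication factor (a sketch of the arithmetic, not Cassandra's actual code): with RF = 4, QUORUM waits for 3 replica acks instead of the 1 that ONE waits for, so an acked write already sits on a majority of replicas.

```java
public class QuorumMath {

    // Sketch: quorum size for a given replication factor,
    // i.e. floor(RF / 2) + 1, a strict majority of replicas.
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(quorum(4)); // RF=4 as in this thread -> 3 acks
    }
}
```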

Hope that helps
Aaron
 

On 06 Oct, 2010, at 12:08 PM, Jason Horman <jhorman@gmail.com> wrote:

Yes, you are right. For some reason we didn't notice it. The Cassandra
server itself was still up, so the on-machine monitoring for the
process didn't go off. nodetool shows it is unresponsive though. We
are still learning how to properly monitor. The machine that went down
ran out of disk space on Amazon EBS.

So I believe that our client was connected to that machine when it ran
into trouble. I am a little surprised that it wasn't disconnected, it
just hung forever. We are using Pelops, which doesn't seem to set a
thrift timeout. It also sets keep alive on the socket. Here is the
pelops connection code.

socket = new TSocket(nodeContext.node, port);
socket.getSocket().setKeepAlive(true);

Server side the default rpc timeout is used.
<RpcTimeoutInMillis>10000</RpcTimeoutInMillis>

Is RpcTimeoutInMillis supposed to have booted our client after 10s, or
is the server now just in a really bad state? Should I modify Pelops
to set a timeout on the TSocket? Is setKeepAlive recommended?

We are writing at consistency level ONE, replication factor is 4.
There are 5 cassandra servers at the moment but in production we will
run with more. This is on Amazon EC2/EBS so IO performance isn't
great. I think that the cluster appears unbalanced b/c of the high
replication factor.

If you are interested here is the stack trace from the machine that
ran out of space.

ERROR [COMMIT-LOG-WRITER] 2010-10-05 16:05:22,393 CassandraDaemon.java
(line 83) Uncaught exception in thread
Thread[COMMIT-LOG-WRITER,5,main]
java.lang.RuntimeException: java.lang.RuntimeException:
java.io.IOException: No space left on device
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.RuntimeException: java.io.IOException: No space
left on device
at org.apache.cassandra.db.commitlog.BatchCommitLogExecutorService.processWithSyncBatch(BatchCommitLogExecutorService.java:102)
at org.apache.cassandra.db.commitlog.BatchCommitLogExecutorService.access$000(BatchCommitLogExecutorService.java:31)
at org.apache.cassandra.db.commitlog.BatchCommitLogExecutorService$1.runMayThrow(BatchCommitLogExecutorService.java:49)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 1 more
Caused by: java.io.IOException: No space left on device
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
at org.apache.cassandra.io.util.BufferedRandomAccessFile.flushBuffer(BufferedRandomAccessFile.java:193)
at org.apache.cassandra.io.util.BufferedRandomAccessFile.sync(BufferedRandomAccessFile.java:173)
at org.apache.cassandra.db.commitlog.CommitLogSegment.sync(CommitLogSegment.java:142)
at org.apache.cassandra.db.commitlog.CommitLog.sync(CommitLog.java:424)
at org.apache.cassandra.db.commitlog.BatchCommitLogExecutorService.processWithSyncBatch(BatchCommitLogExecutorService.java:98)
... 4 more
ERROR [COMPACTION-POOL:1] 2010-10-05 16:05:40,366 CassandraDaemon.java
(line 83) Uncaught exception in thread
Thread[COMPACTION-POOL:1,5,main]
java.util.concurrent.ExecutionException: java.io.IOException: No space
left on device
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:86)
at org.apache.cassandra.db.CompactionManager$CompactionExecutor.afterExecute(CompactionManager.java:577)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: No space left on device
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
at org.apache.cassandra.io.util.BufferedRandomAccessFile.flushBuffer(BufferedRandomAccessFile.java:193)
at org.apache.cassandra.io.util.BufferedRandomAccessFile.seek(BufferedRandomAccessFile.java:239)
at org.apache.cassandra.io.util.BufferedRandomAccessFile.writeAtMost(BufferedRandomAccessFile.java:390)
at org.apache.cassandra.io.util.BufferedRandomAccessFile.write(BufferedRandomAccessFile.java:366)
at org.apache.cassandra.io.SSTableWriter.append(SSTableWriter.java:100)
at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:300)
at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:102)
at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:83)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
... 2 more


On Tue, Oct 5, 2010 at 3:43 PM, Aaron Morton <aaron@thelastpickle.com> wrote:
> The cluster looks unbalanced (assuming the Random Partitioner), did you
> manually assign tokens to the nodes?  The section on Token Selection here has some
> tips http://wiki.apache.org/cassandra/Operations
> One of the nodes in the cluster is down. Is there anything in the log to
> explain why ? You may have some other errors.
> Also want to check:
> - your client has a list of all of the nodes, so it can move to another
> if it was connected to the down node.
> - what's the RF and what consistency level are you writing at.
> - how long is the hang?
> - what's happening on the server while the client is hanging? e.g. is it idle
> or is the CPU going crazy, swapping, iostat
> - what timeout are you using with thrift?
>
> Aaron
> On 06 Oct, 2010, at 07:28 AM, Jason Horman <jhorman@gmail.com> wrote:
>
> We are experiencing some random hangs while importing data into
> Cassandra 0.6.5. The client stack dump is below. We are using Java
> Pelops with Thrift r917130. The hang seems random, sometimes millions
> of records in, sometimes just a few thousand. It sort of smells like
> the JIRA
>
> https://issues.apache.org/jira/browse/CASSANDRA-1175
>
> Has anyone else experienced this? Any advice?
>
> Here is a dump from nodetool
>
> Address        Status  Load      Range                                    Ring
> 10.192.230.224 Down    43.41 GB  25274261893111669883290654807978388961   |<--|
> 10.248.135.223 Up      29.38 GB  34662916595519283353151730886201323030   |  ^
> 10.209.125.235 Up      19.83 GB  45387569059876439228162547977665761954   v  |
> 10.206.209.112 Up      23.59 GB  105389616365686887162471812716889564402  |  ^
> 10.209.22.3    Up      33.16 GB  148562884084359545011181864444489491335  |-->|
>
> Here is the stack
>
> "RMI TCP Connection(4)-10.246.55.223" daemon prio=10
> tid=0x00002aaac0194000 nid=0x53b3 runnable [0x000000004b7dc000]
>    java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
> - locked <0x000000074d23e978> (a java.io.BufferedInputStream)
> at
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:126)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at
> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:92)
> at
> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:85)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
> at
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
> at
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
> at
> org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:794)
> at
> org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:777)
> at org.wyki.cassandra.pelops.Mutator$1.execute(Mutator.java:40)
>



-- 
-jason
