incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Several 'TimedOutException' in stress.py
Date Wed, 09 Mar 2011 00:49:18 GMT
Cool, so it's a server-side timeout, because:

- in the client-side stack, the thrift code is raising the error
- the server-side log has this: DEBUG 22:29:10,318 ... timed out

The TimedOutException is raised when the number of replicas required by your CL has not responded
within the time span specified by rpc_timeout in conf/cassandra.yaml.
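
As a rough illustration of "the number of replicas required by your CL", here is a minimal sketch. The function name is made up for this example, and it only covers ONE, QUORUM and ALL. It also assumes that -l in the stress.py command is the replication factor and -e is the consistency level.

```python
def required_replicas(consistency_level, replication_factor):
    """Return how many replicas must respond before the timeout expires.

    Hypothetical helper for illustration only; not part of Cassandra or stress.py.
    """
    cl = consistency_level.upper()
    if cl == "ONE":
        return 1
    if cl == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if cl == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: %s" % consistency_level)


if __name__ == "__main__":
    # Assumed reading of the stress.py flags: -l 2 (replication factor), -e ALL.
    print(required_replicas("ALL", 2))
```

With ALL, every replica has to answer in time, so a single slow node is enough to cause the timeout.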

In general this means your cluster cannot keep up, or there is some sort of problem. There
can be a number of reasons why things may be going slow. Look into:
- the logs on the other machines, to see if they have messages like "Dropped {} {} messages in
the last {}ms". This means the message was delivered but not processed in time.
- IO performance, see http://spyced.blogspot.com/2010/01/linux-performance-basics.html
- the cassandra thread pools, to see if things are backing up: nodetool tpstats
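
If you want to scan the logs for the dropped-message lines, something like this sketch may help. It assumes the three {} placeholders are a count, a message type and an interval in ms, in that order. The helper name and the sample log line are made up for illustration.

```python
import re

# Assumed shape of the log message: "Dropped <count> <type> messages in the last <n>ms"
DROPPED_RE = re.compile(r"Dropped (\d+) (\S+) messages in the last (\d+)ms")


def find_dropped(lines):
    """Return (count, message_type, interval_ms) for each matching log line."""
    hits = []
    for line in lines:
        match = DROPPED_RE.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return hits


if __name__ == "__main__":
    # Hypothetical sample line, not taken from a real log.
    sample = ["WARN 22:29:10,000 Dropped 12 MUTATION messages in the last 5000ms"]
    for count, message_type, interval_ms in find_dropped(sample):
        print(count, message_type, interval_ms)
```

To use it on a real node, pass an open log file instead of the sample list.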

Hope that helps. 
Aaron


On 9/03/2011, at 11:33 AM, A J wrote:

> Client side (it is just a 5th instance in the same EC2 zone, having
> stress.py installed on it) gives the following error:
> 
> Process Inserter-4:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-1:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-3:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-8:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-2:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-6:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-5:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-7:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-9:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> Process Inserter-10:
> Traceback (most recent call last):
>  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in
> _bootstrap
>    self.run()
>  File "stress.py", line 238, in run
>    self.cclient.batch_mutate(cfmap, consistency)
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 873, in batch_mutate
>    self.recv_batch_mutate()
>  File "/home/ec2-user/cassandra/interface/thrift/gen-py/cassandra/Cassandra.py",
> line 899, in recv_batch_mutate
>    raise result.te
> TimedOutException: TimedOutException()
> 
> The related server side errors look like:
> DEBUG 22:29:04,407 Deleting CommitLog-1299623301883.log.header
> DEBUG 22:29:04,412 Deleting CommitLog-1299623301883.log
> DEBUG 22:29:04,443 Deleting CommitLog-1299623318627.log.header
> DEBUG 22:29:04,443 Deleting CommitLog-1299623318627.log
> DEBUG 22:29:09,202 ... timed out
> DEBUG 22:29:09,426 ... timed out
> DEBUG 22:29:10,318 ... timed out
> DEBUG 22:29:11,354 logged out: #<User allow_all groups=[]>
> DEBUG 22:29:11,354 logged out: #<User allow_all groups=[]>
> DEBUG 22:29:11,354 logged out: #<User allow_all groups=[]>
> DEBUG 22:29:12,442 Processing response on a callback from 784@/10.253.203.224
> DEBUG 22:29:12,443 Processing response on a callback from 786@/10.253.203.224
> DEBUG 22:29:12,443 Processing response on a callback from 791@/10.253.203.224
> 
> 
> 
> On Tue, Mar 8, 2011 at 3:22 PM, aaron morton <aaron@thelastpickle.com> wrote:
>> Is this a client-side timeout or a server-side one? What does the error
>> stack look like?
>> Also check the server-side logs for errors. The thrift API will raise a
>> timeout when fewer nodes than the CL requires respond within rpc_timeout.
>> Good luck
>> Aaron
>> On 9/03/2011, at 7:37 AM, ruslan usifov wrote:
>> 
>> 
>> 2011/3/8 A J <s5alye@gmail.com>
>>> 
>>> Trying out stress.py on AWS EC2 environment (4 Large instances. Each
>>> of 2-cores and 7.5GB RAM. All in the same region/zone.)
>>> 
>>> python stress.py -o insert  -d
>>> 10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t
>>> 10 -n 500 -S 1000000 -k
>>> 
>>> (I want to try with column size of about 1MB. I am assuming the above
>>> gives me 10 parallel threads each executing 50 inserts sequentially
>>> (500/10) ).
>>> 
>>> Getting several timeout errors: TimedOutException(). With just 10
>>> concurrent writes spread across 4 nodes, I am kind of surprised to get so
>>> many timeouts. Any suggestions?
>>> 
>> 
>> 
>> It may be EC2 disk speed degradation (the IO speed of EC2 instances isn't
>> constant and can vary widely)
>> 
>> 

