cassandra-user mailing list archives

From Phillip Michalak <phil.micha...@digitalreasoning.com>
Subject Re: quorum read timeout
Date Thu, 20 Aug 2009 23:15:53 GMT
I can't reproduce this after cleaning out the data/ directories (and
commit logs) and restarting the cluster. I suspect that I had gotten
into an inconsistent state while I was playing around with the keyspace
column families.

I've updated the bug report with a comment to this effect, but have  
left it open for someone with more experience to close it out.

Cheers,
Phil

On Aug 19, 2009, at 5:14 PM, Phillip Michalak wrote:

> Sure thing. Filed as  https://issues.apache.org/jira/browse/CASSANDRA-381
>
> Thanks,
> Phil
>
> On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:
>
>> Looks like a bug in TcpConnectionManager.  Can you file a ticket?
>>
>> thanks,
>>
>> -Jonathan
>>
>> On Wed, Aug 19, 2009 at 2:49 PM, Phillip
>> Michalak<phil.michalak@digitalreasoning.com> wrote:
>>> It's cassandra-0.4-beta1.
>>>
>>> Thanks!
>>> Phil
>>>
>>> On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:
>>>
>>>> Is this 0.3 or 0.4/trunk?
>>>>
>>>> On Wed, Aug 19, 2009 at 2:36 PM, Phillip
>>>> Michalak<phil.michalak@digitalreasoning.com> wrote:
>>>>>
>>>>> I'm running three Cassandra nodes in virtual machines.
>>>>> During a 'get' operation from Cassandra-remote directed at one of these
>>>>> nodes, I'm receiving the following output:
>>>>>
>>>>> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h 192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr' "ColumnPath('strings', None, 'id')" 2
>>>>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
>>>>>  self.message = message
>>>>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
>>>>>  self.message = iprot.readString();
>>>>> Traceback (most recent call last):
>>>>>  File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
>>>>>    pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>>>>>  File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 182, in get
>>>>>    return self.recv_get()
>>>>>  File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 201, in recv_get
>>>>>    raise x
>>>>> thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
>>>>>  if self.message:
>>>>> /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
>>>>>  return self.message
>>>>> : Internal error processing get
>>>>>
>>>>> The same 'get' operation from Cassandra-remote directed at another of
>>>>> these nodes yields 'normal' output:
>>>>> vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h 192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr' "ColumnPath('strings', None, 'id')" 2
>>>>> Traceback (most recent call last):
>>>>>  File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
>>>>>    pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
>>>>>  File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 182, in get
>>>>>    return self.recv_get()
>>>>>  File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 210, in recv_get
>>>>>    raise result.nfe
>>>>> ttypes.NotFoundException: NotFoundException()
>>>>> Furthermore, querying the same column for (some) other keys is successful
>>>>> no matter which node it is directed at.
>>>>> Looking at the log for the node that produced the error from the query
>>>>> above:
>>>>>
>>>>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java (line 221) get
>>>>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line 420) strongread reading data for SliceByNamesReadCommand(table='MockElementLibrary', key='0401318uuuuruepwdcznr', columnParent='QueryPath(columnFamilyName='strings', superColumnName='null', columnName='null')', columns=[id,]) from 38184@null
>>>>> DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line 427) strongread reading digest for SliceByNamesReadCommand(table='MockElementLibrary', key='0401318uuuuruepwdcznr', columnParent='QueryPath(columnFamilyName='strings', superColumnName='null', columnName='null')', columns=[id,]) from 38185@192.168.133.129:7000
>>>>>  WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619 MessageSerializationTask.java (line 81) Exception was generated at : 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
>>>>> java.lang.NullPointerException
>>>>>        at org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
>>>>>        at org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
>>>>>        at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
>>>>>        at org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>        at java.lang.Thread.run(Thread.java:619)
>>>>> DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java (line 38) Processing response on a callback from 65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129:7000
>>>>> ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line 608) Internal error processing get
>>>>> java.lang.RuntimeException: java.util.concurrent.TimeoutException: Operation timed out - received only 1 responses from 192.168.133.129:7000 .
>>>>>        at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
>>>>>        at org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
>>>>>        at org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
>>>>>        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
>>>>>        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>        at java.lang.Thread.run(Thread.java:619)
>>>>> Caused by: java.util.concurrent.TimeoutException: Operation timed out - received only 1 responses from 192.168.133.129:7000 .
>>>>>        at org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
>>>>>        at org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
>>>>>        at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
>>>>>        at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
>>>>>        ... 7 more
>>>>>
>>>>> It appears to me that there is a timeout during the QuorumResponseHandler
>>>>> processing, stemming from a NullPointerException that occurs as part of the
>>>>> read process. I suspect that this NullPointerException has something to do
>>>>> with the second DEBUG [pool-1-thread-22] comment regarding strongread ...
>>>>> from 38184@null.
>>>>> Does anyone know why this might be happening?
>>>>> Thanks for any insight,
>>>>> Phil
>>>>>
>>>
>>>
>
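
The failure mode suspected in the quoted report (a request to a null
endpoint failing on a background thread, and the quorum wait then timing
out with only one response) can be illustrated with a small sketch. This is
not Cassandra's code: send, quorum_read and QuorumTimeout are hypothetical
names, and the swallowed exception only stands in for the
NullPointerException logged by the serializer thread.

```python
import queue
import threading
import time


class QuorumTimeout(Exception):
    """Raised when fewer than `quorum` responses arrive before the deadline."""


def send(endpoint, responses):
    """Hypothetical sender. A None endpoint fails on the worker thread,
    so the caller sees no error, only a reply that never arrives."""
    def task():
        try:
            host, port = endpoint  # raises TypeError when endpoint is None
            responses.put((host, port))
        except TypeError:
            pass  # assumption: the error is logged and dropped, no reply queued
    worker = threading.Thread(target=task)
    worker.start()
    return worker


def quorum_read(endpoints, quorum, timeout_s):
    """Send to every endpoint, then block until `quorum` replies or timeout."""
    responses = queue.Queue()
    for endpoint in endpoints:
        send(endpoint, responses)
    received = []
    deadline = time.monotonic() + timeout_s
    while len(received) < quorum:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise QuorumTimeout("received only %d responses" % len(received))
        try:
            received.append(responses.get(timeout=remaining))
        except queue.Empty:
            raise QuorumTimeout(
                "received only %d responses" % len(received)) from None
    return received
```

With two valid endpoints and a quorum of 2 the call returns both replies;
with one endpoint set to None it raises QuorumTimeout after the timeout,
because the failed send never produces a reply for the caller to count.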

