incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jun Rao <jun...@almaden.ibm.com>
Subject Re: quorum read timeout
Date Thu, 20 Aug 2009 23:37:47 GMT

You should be aware that Cassandra doesn't support changing column families
without reloading the data.

Jun
IBM Almaden Research Center
K55/B1, 650 Harry Road, San Jose, CA  95120-6099

junrao@almaden.ibm.com



|------------>
| From:      |
|------------>
  >--------------------------------------------------------------------------------------------------------------------------------------------------|
  |Phillip Michalak <phil.michalak@digitalreasoning.com>                           
                                                                 |
  >--------------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| To:        |
|------------>
  >--------------------------------------------------------------------------------------------------------------------------------------------------|
  |cassandra-user@incubator.apache.org                                                   
                                                           |
  >--------------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Date:      |
|------------>
  >--------------------------------------------------------------------------------------------------------------------------------------------------|
  |08/20/2009 04:16 PM                                                                   
                                                           |
  >--------------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Subject:   |
|------------>
  >--------------------------------------------------------------------------------------------------------------------------------------------------|
  |Re: quorum read timeout                                                               
                                                           |
  >--------------------------------------------------------------------------------------------------------------------------------------------------|





I can't reproduce this after cleaning out the data/ directories (and commit
logs) and restarting the cluster. I suspect that I had gotten to an
inconsistent state as I was playing around with the keyspace column
families.

I've updated the bug report with a comment to this effect, but have left it
open for someone with more experience to close it out.

Cheers,
Phil

On Aug 19, 2009, at 5:14 PM, Phillip Michalak wrote:

      Sure thing. Filed as
      https://issues.apache.org/jira/browse/CASSANDRA-381

      Thanks,
      Phil

      On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:

            Looks like a bug in TcpConnectionManager.  Can you file a
            ticket?

            thanks,

            -Jonathan

            On Wed, Aug 19, 2009 at 2:49 PM, Phillip
            Michalak<phil.michalak@digitalreasoning.com> wrote:
                  It's cassandra-0.4-beta1.

                  Thanks!
                  Phil

                  On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:

                        Is this 0.3 or 0.4/trunk?

                        On Wed, Aug 19, 2009 at 2:36 PM, Phillip
                        Michalak<phil.michalak@digitalreasoning.com> wrote:

                              I'm running three Cassandra nodes in virtual
                              machines.
                              During a 'get' operation from
                              Cassandra-remote directed at one of these
                              nodes, I'm receiving the following output

                              vadmin@vadmin:~/cassandra$
                              interface/gen-py/cassandra/Cassandra-remote
                              -h
                              192.168.133.130:9160 get 'MockElementLibrary'
                              '0401318uuuuruepwdcznr'
                              "ColumnPath('strings', None, 'id')" 2
                              /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58:
                              DeprecationWarning: BaseException.message has
                              been deprecated as of
                              Python
                              2.6
                               self.message = message
                              /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99:
                              DeprecationWarning: BaseException.message has
                              been deprecated as of
                              Python
                              2.6
                               self.message = iprot.readString();
                              Traceback (most recent call last):
                               File
                              "interface/gen-py/cassandra/Cassandra-remote",
 line 93, in <module>
                                 pp.pprint(client.get(args[0],args[1],eval
                              (args[2]),eval(args[3]),))
                               File

                              "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                              line 182, in get
                                 return self.recv_get()
                               File

                              "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                              line 201, in recv_get
                                 raise x

                              thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76:
                              DeprecationWarning: BaseException.message has
                              been deprecated as of
                              Python
                              2.6
                               if self.message:
                              /usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77:
                              DeprecationWarning: BaseException.message has
                              been deprecated as of
                              Python
                              2.6
                               return self.message
                              : Internal error processing get

                              The same 'get' operation from
                              Cassandra-remote directed at another of
                              these
                              nodes, yields 'normal' output
                              vadmin@vadmin:~/cassandra$
                              interface/gen-py/cassandra/Cassandra-remote
                              -h
                              192.168.133.129:9160 get 'MockElementLibrary'
                              '0401318uuuuruepwdcznr'
                              "ColumnPath('strings', None, 'id')" 2
                              Traceback (most recent call last):
                               File
                              "interface/gen-py/cassandra/Cassandra-remote",
 line 93, in <module>
                                 pp.pprint(client.get(args[0],args[1],eval
                              (args[2]),eval(args[3]),))
                               File

                              "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                              line 182, in get
                                 return self.recv_get()
                               File

                              "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py",
                              line 210, in recv_get
                                 raise result.nfe
                              ttypes.NotFoundException: NotFoundException()
                              Furthermore, querying the same column for
                              (some) other keys is successful
                              when no matter which node it is directed at.
                              Looking at the log for the node that produced
                              the error from the query
                              above:

                              DEBUG [pool-1-thread-22] 2009-08-19
                              16:54:57,618 CassandraServer.java
                              (line
                              221) get
                              DEBUG [pool-1-thread-22] 2009-08-19
                              16:54:57,618 StorageProxy.java (line
                              420) strongread reading data for
                              SliceByNamesReadCommand
                              (table='MockElementLibrary',
                              key='0401318uuuuruepwdcznr',
                              columnParent='QueryPath
                              (columnFamilyName='strings',
                              superColumnName='null',
                              columnName='null')', columns=[id,]) from
                              38184@null
                              DEBUG [pool-1-thread-22] 2009-08-19
                              16:54:57,619 StorageProxy.java (line
                              427) strongread reading digest for
                              SliceByNamesReadCommand
                              (table='MockElementLibrary',
                              key='0401318uuuuruepwdcznr',
                              columnParent='QueryPath
                              (columnFamilyName='strings',
                              superColumnName='null',
                              columnName='null')', columns=[id,]) from
                              38185@192.168.133.129:7000
                               WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19
                              16:54:57,619
                              MessageSerializationTask.java (line 81)
                              Exception was generated at :
                              08/19/2009 16:54:57 on thread
                              MESSAGE-SERIALIZER-POOL:4
                              java.lang.NullPointerException
                                     at
                              org.apache.cassandra.net.TcpConnection.<init>
                              (TcpConnection.java:83)
                                     at

                              org.apache.cassandra.net.TcpConnectionManager.getConnection
(TcpConnectionManager.java:64)
                                     at

                              org.apache.cassandra.net.MessagingService.getConnection
(MessagingService.java:306)
                                     at

                              org.apache.cassandra.net.MessageSerializationTask.run
(MessageSerializationTask.java:66)
                                     at

                              java.util.concurrent.ThreadPoolExecutor
                              $Worker.runTask(ThreadPoolExecutor.java:886)
                                     at

                              java.util.concurrent.ThreadPoolExecutor
                              $Worker.run(ThreadPoolExecutor.java:908)
                                     at java.lang.Thread.run
                              (Thread.java:619)
                              DEBUG [RESPONSE-STAGE:4] 2009-08-19
                              16:54:57,622 ResponseVerbHandler.java
                              (line 38) Processing response on a callback
                              from
                              65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129
                              :7000
                              ERROR [pool-1-thread-22] 2009-08-19
                              16:55:02,619 Cassandra.java (line
                              608)
                              Internal error processing get
                              java.lang.RuntimeException:
                              java.util.concurrent.TimeoutException:
                              Operation
                              timed out - received only 1 responses from
                              192.168.133.129:7000 .
                                     at

                              org.apache.cassandra.service.CassandraServer.readColumnFamily
(CassandraServer.java:100)
                                     at

                              org.apache.cassandra.service.CassandraServer.get
(CassandraServer.java:226)
                                     at

                              org.apache.cassandra.service.Cassandra
                              $Processor$get.process(Cassandra.java:602)
                                     at

                              org.apache.cassandra.service.Cassandra
                              $Processor.process(Cassandra.java:560)
                                     at

                              org.apache.thrift.server.TThreadPoolServer
                              $WorkerProcess.run
                              (TThreadPoolServer.java:252)
                                     at

                              java.util.concurrent.ThreadPoolExecutor
                              $Worker.runTask(ThreadPoolExecutor.java:886)
                                     at

                              java.util.concurrent.ThreadPoolExecutor
                              $Worker.run(ThreadPoolExecutor.java:908)
                                     at java.lang.Thread.run
                              (Thread.java:619)
                              Caused by:
                              java.util.concurrent.TimeoutException:
                              Operation timed out -
                              received only 1 responses from
                              192.168.133.129:7000 .
                                     at

                              org.apache.cassandra.service.QuorumResponseHandler.get
(QuorumResponseHandler.java:86)
                                     at

                              org.apache.cassandra.service.StorageProxy.strongRead
(StorageProxy.java:435)
                                     at

                              org.apache.cassandra.service.StorageProxy.readProtocol
(StorageProxy.java:330)
                                     at

                              org.apache.cassandra.service.CassandraServer.readColumnFamily
(CassandraServer.java:92)
                                     ... 7 more

                              It appears to me that there is a timeout
                              during the QuorumResponseHandler
                              processing, stemming from a
                              NullPointerException that occurs as part of
                              the
                              read process. I suspect that this
                              NullPointerException has something to
                              do
                              with the second DEBUG [pool-1-thread-22]
                              comment regarding strongread ...
                              from 38184@null.
                              Does anyone know why this might be happening?
                              Thanks for any insight,
                              Phil






Mime
View raw message