cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-156) error reading key until first use of the HTTP interface
Date Fri, 08 May 2009 17:20:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707428#action_12707428 ]

Jonathan Ellis commented on CASSANDRA-156:
------------------------------------------

in the future, please add thread dumps as attachments; they're a bit long :)

this part is the key:

"MESSAGE-SERIALIZER-POOL:1" prio=10 tid=0x00002aaafc06a400 nid=0x3387 waiting for monitor
entry [0x000000004268d000..0x000000004268dd00]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:115)
        - waiting to lock <0x00002aaab2397368> (a java.util.Collections$UnmodifiableSet)
        at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:180)
        - locked <0x00002aaab26c39c0> (a java.lang.Object)
        at org.apache.cassandra.net.SelectorManager.register(SelectorManager.java:79)
        at org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:91)
        at org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
        at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:313)
        at org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

that says register() is blocked on a lock held by the select() call in SelectorManager:

                selector.select(100);

this is a bug in the JDK or your OS.  (I'm not sure how to narrow it down further.)  The documented
semantics of select(100) are:

     * @param  timeout  If positive, block for up to <tt>timeout</tt>
     *                  milliseconds, more or less, while waiting for a
     *                  channel to become ready; if zero, block indefinitely;
     *                  must not be negative

so select() should return and release that lock at least every 100ms, letting pending register()
calls go through, but here you are getting stuck indefinitely anyway.
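
As a rough illustration of the timeout contract quoted above (a standalone sketch, not Cassandra
code; the names SelectTimeoutDemo and timedSelect are made up here), a select(100) on a selector
with no ready channels should come back after roughly 100ms:

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class SelectTimeoutDemo {
    /**
     * Runs one select(timeoutMillis) on a selector with no registered channels.
     * Returns {readyCount, elapsedMillis}.
     */
    static long[] timedSelect(long timeoutMillis) throws IOException {
        try (Selector selector = Selector.open()) {
            long start = System.nanoTime();
            // Nothing is registered, so only the timeout can end this call.
            int ready = selector.select(timeoutMillis);
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            return new long[] { ready, elapsedMillis };
        }
    }

    public static void main(String[] args) throws IOException {
        long[] result = timedSelect(100);
        System.out.println("ready=" + result[0] + " elapsedMs=" + result[1]);
    }
}
```

If select() returns on that schedule, the selector thread gives up its locks between iterations,
which is what should give a blocked register() its chance to run.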

we ran into this in CASSANDRA-97 too.  There we were able to re-order things so that all the
register() calls happen before the first select() call, but in this case that doesn't seem possible.
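
When the register() calls can't all be moved ahead of the first select(), one commonly used
alternative is to never call register() from other threads at all: queue the registration,
wakeup() the selector, and let the selector thread perform it between select() calls.  A minimal
sketch of that pattern follows.  It is not the contents of the attached patch, and
QueuedSelectorLoop and its methods are hypothetical names, not Cassandra classes; it also uses
CompletableFuture and lambdas, which need Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical selector loop that performs all registrations on its own thread. */
public class QueuedSelectorLoop implements Runnable {
    private final Selector selector;
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    public QueuedSelectorLoop() throws IOException {
        this.selector = Selector.open();
    }

    /** Safe to call from any thread: enqueues the registration instead of doing it here. */
    public CompletableFuture<SelectionKey> register(SelectableChannel channel, int ops) {
        CompletableFuture<SelectionKey> result = new CompletableFuture<>();
        pending.add(() -> {
            try {
                // Runs on the selector thread, so it never contends with select().
                result.complete(channel.register(selector, ops));
            } catch (Exception e) {
                result.completeExceptionally(e);
            }
        });
        selector.wakeup(); // make the current (or next) select() return promptly
        return result;
    }

    /** Asks the loop to exit; the selector is closed by the loop thread itself. */
    public void stop() {
        running = false;
        selector.wakeup();
    }

    @Override
    public void run() {
        try {
            while (running) {
                selector.select(100);
                Runnable task;
                while ((task = pending.poll()) != null) {
                    task.run(); // perform queued registrations between selects
                }
                // A real loop would dispatch the ready keys here; this sketch drops them.
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try {
                selector.close();
            } catch (IOException ignored) {
                // nothing useful to do while shutting down
            }
        }
    }
}
```

A caller would start the loop on its own thread and wait on the returned future, e.g.
loop.register(channel, SelectionKey.OP_READ).get(), with the channel already switched to
non-blocking mode.  The trade-off is an extra hop through the queue on every registration.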

Try the attached patch and see if that helps.  If it doesn't, try changing select(100) to
select() (still with the patch applied).

> error reading key until first use of the HTTP interface
> -------------------------------------------------------
>
>                 Key: CASSANDRA-156
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-156
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: trunk
>         Environment: java version "1.6.0_13" Linux tst04o 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 09:10:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Mark Robson
>            Priority: Minor
>         Attachments: 156.patch
>
>
> After startup, but before the first access to the HTTP interface, the thrift command get_slice returns the following error:
> ./Cassandra-remote -h tst04o:9160 get_slice Messages 305 base 0 1
> Traceback (most recent call last):
>   File "./Cassandra-remote", line 96, in ?
>     pp.pprint(client.get_slice(args[0],args[1],args[2],eval(args[3]),eval(args[4]),))
>   File "/opt/mailcontrol/gen-py/org/apache/cassandra/Cassandra.py", line 213, in get_slice
>     return self.recv_get_slice()
>   File "/opt/mailcontrol/gen-py/org/apache/cassandra/Cassandra.py", line 233, in recv_get_slice
>     raise x
> thrift.Thrift.TApplicationException: Internal error processing get_slice
> Error message on the log file:
> ERROR [pool-1-thread-1] 2009-05-08 14:49:36,977 Cassandra.java (line 823) Internal error processing get_slice
> java.lang.RuntimeException: error reading key 305
>         at org.apache.cassandra.service.StorageProxy.weakReadRemote(StorageProxy.java:256)
>         at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:363)
>         at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:112)
>         at org.apache.cassandra.service.CassandraServer.get_slice(CassandraServer.java:191)
>         at org.apache.cassandra.service.Cassandra$Processor$get_slice.process(Cassandra.java:817)
>         at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:805)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException: Operation timed out.
>         at org.apache.cassandra.net.AsyncResult.get(AsyncResult.java:95)
>         at org.apache.cassandra.service.StorageProxy.weakReadRemote(StorageProxy.java:252)
>         ... 9 more
> After first access to the HTTP interface, the get_slice method now succeeds.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.