cassandra-user mailing list archives

From Jonathan Ellis <>
Subject Re: get_key_range (CASSANDRA-169)
Date Thu, 10 Sep 2009 21:25:31 GMT
I think I see the problem.

Can you check whether your range query is spanning multiple nodes in the
cluster?  You can tell by setting the log level to DEBUG: after it logs
get_key_range, look for "reading RangeCommand(...) from ... @machine"
appearing more than once.
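
A quick way to spot this in the logs, sketched as a self-contained shell check (the log path here is a temporary sample and the message format is taken from the description above, not from a real Cassandra install):

```shell
# Simulate a DEBUG log and count "reading RangeCommand" lines.
# In a real cluster you would grep the actual Cassandra log file
# configured in log4j instead of this temporary sample.
log=$(mktemp)
cat > "$log" <<'EOF'
DEBUG - get_key_range
DEBUG - reading RangeCommand(...) from 123 @machine1
DEBUG - reading RangeCommand(...) from 456 @machine2
EOF
count=$(grep -c "reading RangeCommand" "$log")
echo "$count"
```

More than one match per get_key_range call means the query spanned multiple nodes.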

The bug is that when picking the node to start the range query, it
consults the failure detector to avoid dead nodes, but if the query
spans multiple nodes it does not do the same check for subsequent nodes.
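
The shape of that bug can be sketched in a few lines of Java (a hypothetical simplification, not the actual StorageProxy code; the node names and the failure-detector stand-in are made up):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RangeRoutingSketch {
    // Stand-in for the failure detector's view of the cluster.
    static final Set<String> DEAD = new HashSet<>(Arrays.asList("node5"));
    static final List<String> RING =
            Arrays.asList("node1", "node2", "node3", "node4", "node5");

    static boolean isAlive(String node) {
        return !DEAD.contains(node);
    }

    // Buggy routing: only the *starting* endpoint consults the failure
    // detector; endpoints for subsequent RangeCommands are taken from the
    // ring blindly, so a dead node can still be contacted.
    static List<String> pickEndpoints(int span) {
        int i = 0;
        while (!isAlive(RING.get(i))) i++;   // liveness check, first hop only
        List<String> chosen = new ArrayList<>();
        for (int n = 0; n < span; n++)
            chosen.add(RING.get((i + n) % RING.size()));  // no check here
        return chosen;
    }

    public static void main(String[] args) {
        // A query spanning the whole ring still gets routed to dead node5.
        System.out.println(pickEndpoints(5));
    }
}
```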

But if you are only generating one RangeCommand per get_key_range then
we have two bugs. :)


On Wed, Sep 9, 2009 at 6:03 PM, Simon Smith<> wrote:
> I think it might take quite a bit of effort for me to figure out how
> to use a java debugger - it will be a lot quicker if you can give me
> a patch; then I can certainly re-build with ant against either latest
> trunk or latest 0.4 and re-run my test.
> Thanks,
> Simon
> On Wed, Sep 9, 2009 at 6:52 PM, Jonathan Ellis <> wrote:
>> Okay, so when #5 comes back up, #1 eventually stops erroring out and
>> you don't have to restart #1?  That is good, that would have been a
>> bigger problem. :)
>> If you are comfortable using a Java debugger (by default Cassandra
>> listens for one on 8888) you can look at what is going on inside
>> StorageProxy.getKeyRange on node #1 at the call to
>>        EndPoint endPoint =
>> StorageService.instance().findSuitableEndPoint(command.startWith);
>> findSuitableEndpoint is supposed to pick a live node, not a dead one. :)
>> If not I can write a patch to log extra information for this bug so we
>> can track it down.
>> -Jonathan
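
For anyone following along, attaching jdb to that port from a shell might look like this (the JDWP agent flags are an assumption based on the 8888 default mentioned above; adjust to how your JVM was actually started):

```shell
# The Cassandra JVM must be started with a socket-transport debug agent, e.g.:
#   -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8888
# Then attach from another terminal:
jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=8888
# Inside jdb, break at the call site discussed above:
#   stop in org.apache.cassandra.service.StorageProxy.getKeyRange
```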
>> On Wed, Sep 9, 2009 at 5:43 PM, Simon Smith<> wrote:
>>> The errors start as soon as node #5 goes down and last until I
>>> restart it.
>>> bin/nodeprobe cluster is accurate (it knows quickly when #5 is down,
>>> and when it is up again)
>>> Since I set the replication factor to 3, I'm confused as to why
>>> (after the first few seconds or so) there is an error just because
>>> one host is down temporarily.
>>> The way I have the test set up is that I have a script running on
>>> each of the nodes, calling get_key_range over and over against
>>> "localhost".  Depending on which node I take down, the behavior
>>> varies: if I take down one particular host, it is the only one
>>> giving errors (the other 4 nodes still work).  In the other 4
>>> cases, either 2 or 3 nodes continue to work (i.e. the downed node
>>> and either one or two other nodes are the ones giving errors).
>>> Note: the nodes that keep working never fail at all, not even for
>>> a few seconds.
>>> I am running this on 4GB "cloud server" boxes in Rackspace, I can set
>>> up just about any test needed to help debug this and capture output or
>>> logs, and can give a Cassandra developer access if it would help.  Of
>>> course I can include whatever config files or log files would be
>>> helpful, I just don't want to spam the list unless it is relevant.
>>> Thanks again,
>>> Simon
>>> On Tue, Sep 8, 2009 at 6:26 PM, Jonathan Ellis<> wrote:
>>>> getting temporary errors when a node goes down, until the other nodes'
>>>> failure detectors realize it's down, is normal.  (this should only
>>>> take a dozen seconds, or so.)
>>>> but after that it should route requests to other nodes, and it should
>>>> also realize when you restart #5 that it is alive again.  those are
>>>> two separate issues.
>>>> can you verify that "bin/nodeprobe cluster" shows that node 1
>>>> eventually does/does not see #5 dead, and alive again?
>>>> -Jonathan
>>>> On Tue, Sep 8, 2009 at 5:05 PM, Simon Smith<> wrote:
>>>>> I'm seeing an issue similar to:
>>>>> Here is when I see it.  I'm running Cassandra on 5 nodes using the
>>>>> OrderPreservingPartitioner, and have populated Cassandra with 78
>>>>> records, and I can use get_key_range via Thrift just fine.  Then,
>>>>> if I manually kill one of the nodes (say I kill off node #5), the
>>>>> node (node #1) which I've been using to call get_key_range will
>>>>> time out with the error:
>>>>>  Thrift: Internal error processing get_key_range
>>>>> And the Cassandra output shows the same trace as in 169:
>>>>> ERROR - Encountered IOException on connection:
>>>>> java.nio.channels.SocketChannel[closed]
>>>>> Connection refused
>>>>>        at Method)
>>>>>        at
>>>>>        at
>>>>>        at
>>>>>        at
>>>>> WARN - Closing down connection java.nio.channels.SocketChannel[closed]
>>>>> ERROR - Internal error processing get_key_range
>>>>> java.lang.RuntimeException: java.util.concurrent.TimeoutException:
>>>>> Operation timed out.
>>>>>        at org.apache.cassandra.service.StorageProxy.getKeyRange(
>>>>>        at org.apache.cassandra.service.CassandraServer.get_key_range(
>>>>>        at org.apache.cassandra.service.Cassandra$Processor$get_key_range.process(
>>>>>        at org.apache.cassandra.service.Cassandra$Processor.process(
>>>>>        at org.apache.thrift.server.TThreadPoolServer$
>>>>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(
>>>>>        at java.util.concurrent.ThreadPoolExecutor$
>>>>>        at
>>>>> Caused by: java.util.concurrent.TimeoutException: Operation timed out.
>>>>>        at
>>>>>        at org.apache.cassandra.service.StorageProxy.getKeyRange(
>>>>>        ... 7 more
>>>>> If it was giving an error just one time, I could just catch the
>>>>> error and try again.  But get_key_range calls to the node I was
>>>>> already querying (node #1) never work again (the node is still up
>>>>> and responds fine to multiget Thrift calls), sometimes not even
>>>>> after I restart the down node (node #5).  I end up having to
>>>>> restart node #1 in addition to node #5.  The behavior of the other
>>>>> 3 nodes varies - some of them are also unable to respond to
>>>>> get_key_range calls, but some do respond.
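
Until the routing bug is fixed, a client can work around errors like this by failing over to another node. A generic sketch of that retry shape (the interface and names are invented for illustration; the actual Thrift get_key_range call would go inside the NodeCall):

```java
import java.util.List;

public class Failover {
    // Minimal stand-in for "one RPC against one node".
    public interface NodeCall<T> {
        T run(String node) throws Exception;
    }

    // Try the same call against each node in turn until one succeeds;
    // rethrow the last failure if every node errors out.
    public static <T> T callWithFailover(List<String> nodes, NodeCall<T> call)
            throws Exception {
        if (nodes.isEmpty())
            throw new IllegalArgumentException("no nodes to try");
        Exception last = null;
        for (String node : nodes) {
            try {
                return, node);   // e.g. a Thrift get_key_range
            } catch (Exception e) {
                last = e;                  // node down or timed out; try next
            }
        }
        throw last;
    }
}
```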
>>>>> My question is, what path should I go down to reproduce this
>>>>> problem?  I'm using Aug 27 trunk code - should I update my
>>>>> Cassandra install before gathering more information for this
>>>>> issue, and if so, to which version (0.4 or trunk)?  If anyone is
>>>>> familiar with this issue, could you let me know what I might be
>>>>> doing wrong, or what my next info-gathering step should be?
>>>>> Thank you,
>>>>> Simon Smith
>>>>> Arcode Corporation
