incubator-cassandra-user mailing list archives

From Ramzi Rabah <rra...@playdom.com>
Subject Re: Encountering timeout exception when running get_key_range
Date Tue, 20 Oct 2009 03:28:52 GMT
Hi Jonathan, the data is about 60 MB. Would you like me to send it to you?


On Mon, Oct 19, 2009 at 8:20 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> Is the data on 6, 9, or 10 small enough that you could tar.gz it up
> for me to use to reproduce over here?
>
> On Mon, Oct 19, 2009 at 10:17 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>> So my cluster has 4 nodes: node6, node8, node9 and node10. I turned
>> them all off.
>> 1- I started node6 by itself and still got the problem.
>> 2- I started node8 by itself and it ran fine (returned no keys)
>> 3- I started node9 by itself and still got the problem.
>> 4- I started node10 by itself and still got the problem.
>>
>> Ray
>>
>>
>>
>> On Mon, Oct 19, 2009 at 7:44 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>> That's really strange...  Can you reproduce on a single-node cluster?
>>>
>>> On Mon, Oct 19, 2009 at 9:34 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>> The rows are very small. There are a handful of columns per row
>>>> (about 4-5 columns per row).
>>>> Each column has a name which is a String (20-30 characters long), and
>>>> the value is an empty array of bytes (new byte[0]).
>>>> I just use the names of the columns, and don't need to store any
>>>> values in this Column Family.
>>>>
>>>> -- Ray
>>>>
>>>> On Mon, Oct 19, 2009 at 7:24 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>> Can you tell me anything about the nature of your rows?  Many/few
>>>>> columns?  Large/small column values?
>>>>>
>>>>> On Mon, Oct 19, 2009 at 9:17 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>> Hi Jonathan
>>>>>> I actually spoke too soon. Now even if I restart the servers it still
>>>>>> gives a timeout exception.
>>>>>> As for the sstable files, I am not sure which ones are the sstables,
>>>>>> but here is the list of files in the data directory that are prefixed
>>>>>> with the column family name:
>>>>>> DatastoreDeletionSchedule-1-Data.db
>>>>>> DatastoreDeletionSchedule-1-Filter.db
>>>>>> DatastoreDeletionSchedule-1-Index.db
>>>>>> DatastoreDeletionSchedule-5-Data.db
>>>>>> DatastoreDeletionSchedule-5-Filter.db
>>>>>> DatastoreDeletionSchedule-5-Index.db
>>>>>> DatastoreDeletionSchedule-7-Data.db
>>>>>> DatastoreDeletionSchedule-7-Filter.db
>>>>>> DatastoreDeletionSchedule-7-Index.db
>>>>>> DatastoreDeletionSchedule-8-Data.db
>>>>>> DatastoreDeletionSchedule-8-Filter.db
>>>>>> DatastoreDeletionSchedule-8-Index.db
>>>>>>
>>>>>> I am not currently doing any system stat collection.
>>>>>>
>>>>>> On Mon, Oct 19, 2009 at 6:41 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>> How many sstable files are in the data directories for the
>>>>>>> columnfamily you are querying?
>>>>>>>
>>>>>>> How many are there after you restart and it is happy?
>>>>>>>
>>>>>>> Are you doing system stat collection with munin or ganglia or some such?
>>>>>>>
>>>>>>> On Mon, Oct 19, 2009 at 8:25 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>> Hi Jonathan, I updated to 0.4.1 and I still get the same exception when I
>>>>>>>> call get_key_range.
>>>>>>>> I checked all the server logs, and there is only one exception being
>>>>>>>> thrown by whichever server I am connecting to.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Ray
>>>>>>>>
>>>>>>>> On Mon, Oct 19, 2009 at 4:52 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>>> No, it's smart enough to avoid scanning.
>>>>>>>>>
>>>>>>>>> On Mon, Oct 19, 2009 at 6:49 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>>> Hi Jonathan, thanks for the reply. I will update the code to 0.4.1 and
>>>>>>>>>> will check all the logs on all the machines.
>>>>>>>>>> Just a simple question: when you do a get_key_range and you specify ""
>>>>>>>>>> and "" for start and end, and the limit is 25, if there are too many
>>>>>>>>>> entries, does it do a scan to find out the start or is it smart enough
>>>>>>>>>> to know what the start key is?
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 19, 2009 at 4:42 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>>>>> You should check the other nodes for potential exceptions keeping them
>>>>>>>>>>> from replying.
>>>>>>>>>>>
>>>>>>>>>>> Without seeing that it's hard to say if this is caused by an old bug,
>>>>>>>>>>> but you should definitely upgrade to 0.4.1 either way :)
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 19, 2009 at 5:51 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> I am running into problems with get_key_range. I have
>>>>>>>>>>>> OrderPreservingPartitioner defined in storage-conf.xml and I am using
>>>>>>>>>>>> a columnfamily that looks like
>>>>>>>>>>>>     <ColumnFamily CompareWith="BytesType"
>>>>>>>>>>>>                   Name="DatastoreDeletionSchedule"
>>>>>>>>>>>>                   />
>>>>>>>>>>>>
>>>>>>>>>>>> My command is client.get_key_range("Keyspace1", "DatastoreDeletionSchedule",
>>>>>>>>>>>>                    "", "", 25, ConsistencyLevel.ONE);
>>>>>>>>>>>>
>>>>>>>>>>>> It usually works fine, but after a day or so of server writes into
>>>>>>>>>>>> this column family, I started getting
>>>>>>>>>>>> ERROR [pool-1-thread-36] 2009-10-19 17:24:28,223 Cassandra.java (line 770) Internal error processing get_key_range
>>>>>>>>>>>> java.lang.RuntimeException: java.util.concurrent.TimeoutException: Operation timed out.
>>>>>>>>>>>>        at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:560)
>>>>>>>>>>>>        at org.apache.cassandra.service.CassandraServer.get_key_range(CassandraServer.java:595)
>>>>>>>>>>>>        at org.apache.cassandra.service.Cassandra$Processor$get_key_range.process(Cassandra.java:766)
>>>>>>>>>>>>        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:609)
>>>>>>>>>>>>        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>>>>>>>>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>>>>>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>>>>>>>>>        at java.lang.Thread.run(Thread.java:619)
>>>>>>>>>>>> Caused by: java.util.concurrent.TimeoutException: Operation timed out.
>>>>>>>>>>>>        at org.apache.cassandra.net.AsyncResult.get(AsyncResult.java:97)
>>>>>>>>>>>>        at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:556)
>>>>>>>>>>>>        ... 7 more
>>>>>>>>>>>>
>>>>>>>>>>>> I still get the timeout exceptions even though the servers have been
>>>>>>>>>>>> idle for 2 days. When I restart the cassandra servers, it seems to
>>>>>>>>>>>> work fine again. Any ideas what could be wrong?
>>>>>>>>>>>>
>>>>>>>>>>>> By the way, I am using version:apache-cassandra-incubating-0.4.0-rc2
>>>>>>>>>>>> Not sure if this is fixed in the 0.4.1 version
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Ray
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
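
The file listing quoted in this thread groups naturally by the number in the middle of each name. Assuming the naming scheme <ColumnFamily>-<generation>-<Component>.db, with one Data, Index and Filter file per sstable (an inference from the listing, not something confirmed in the thread), a minimal Java sketch for answering "how many sstable files are there" might look like the following. The class name SstableCount and the method generations are hypothetical and not part of Cassandra.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SstableCount {
    // Assumed naming scheme, inferred from the listing in this thread:
    // <ColumnFamily>-<generation>-<Component>.db
    private static final Pattern NAME =
        Pattern.compile("(.+)-(\\d+)-(Data|Index|Filter)\\.db");

    /** Returns the distinct generation numbers found for one column family. */
    static Set<Integer> generations(List<String> fileNames, String columnFamily) {
        Set<Integer> result = new TreeSet<>();
        for (String name : fileNames) {
            Matcher m = NAME.matcher(name);
            // Ignore files that belong to other column families or other schemes.
            if (m.matches() && m.group(1).equals(columnFamily)) {
                result.add(Integer.parseInt(m.group(2)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // File names copied from the listing quoted in this thread.
        List<String> files = List.of(
            "DatastoreDeletionSchedule-1-Data.db",
            "DatastoreDeletionSchedule-1-Filter.db",
            "DatastoreDeletionSchedule-1-Index.db",
            "DatastoreDeletionSchedule-5-Data.db",
            "DatastoreDeletionSchedule-5-Filter.db",
            "DatastoreDeletionSchedule-5-Index.db",
            "DatastoreDeletionSchedule-7-Data.db",
            "DatastoreDeletionSchedule-7-Filter.db",
            "DatastoreDeletionSchedule-7-Index.db",
            "DatastoreDeletionSchedule-8-Data.db",
            "DatastoreDeletionSchedule-8-Filter.db",
            "DatastoreDeletionSchedule-8-Index.db");
        Set<Integer> gens = generations(files, "DatastoreDeletionSchedule");
        // Prints: 4 sstables, generations [1, 5, 7, 8]
        System.out.println(gens.size() + " sstables, generations " + gens);
    }
}
```

Under that assumption the twelve files correspond to four sstables. Counting by generation rather than by file avoids triple-counting the Data, Index and Filter components.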

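The stack trace quoted in this thread shows a TimeoutException raised from a bounded wait (AsyncResult.get) and rethrown as a RuntimeException by the caller (StorageProxy.getKeyRange). The sketch below reproduces that shape with plain java.util.concurrent classes. It is an analogy only, not Cassandra's code: TimeoutWrap and waitForReply are hypothetical names, and CompletableFuture merely stands in for a reply that never arrives.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutWrap {
    /** Waits a bounded time for a reply; failures surface as RuntimeException. */
    static String waitForReply(Future<String> reply, long timeoutMs) {
        try {
            return reply.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt flag before rethrowing.
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException | TimeoutException e) {
            // The original exception stays reachable as the cause,
            // which is what produces a "Caused by:" section in a stack trace.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // A future that is never completed models a node that never replies.
        Future<String> silentNode = new CompletableFuture<>();
        try {
            waitForReply(silentNode, 50);
        } catch (RuntimeException e) {
            // Prints: java.util.concurrent.TimeoutException
            System.out.println(e.getCause().getClass().getName());
        }
    }
}
```

In this model only the waiting side sees an exception, which is consistent with the report that only the server being connected to logs one. It says nothing about why the reply is missing, so the other nodes' logs are still worth checking.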