incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Encountering timeout exception when running get_key_range
Date Tue, 20 Oct 2009 02:44:44 GMT
That's really strange...  Can you reproduce on a single-node cluster?

On Mon, Oct 19, 2009 at 9:34 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
> The rows are very small: a handful of columns per row (about 4-5).
> Each column has a name that is a String (20-30 characters long), and
> the value is an empty array of bytes (new byte[0]).
> I just use the names of the columns and don't need to store any
> values in this Column Family.
>
> -- Ray
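
For illustration, here is a minimal sketch of writing one of these name-only
columns through the 0.4-era Thrift interface. The keyspace, key, and column
name are hypothetical, and the exact generated-client signatures may vary by
release:

    import org.apache.cassandra.service.Cassandra;
    import org.apache.cassandra.service.ColumnPath;
    import org.apache.cassandra.service.ConsistencyLevel;

    // Assumes an already-open Thrift connection wrapped in Cassandra.Client.
    void writeNameOnlyColumn(Cassandra.Client client, String key,
                             String columnName) throws Exception {
        // The column name carries all the information; the value is an
        // empty byte array, as described above.
        ColumnPath path = new ColumnPath("DatastoreDeletionSchedule", null,
                                         columnName.getBytes("UTF-8"));
        client.insert("Keyspace1", key, path, new byte[0],
                      System.currentTimeMillis(), ConsistencyLevel.ONE);
    }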
>
> On Mon, Oct 19, 2009 at 7:24 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> Can you tell me anything about the nature of your rows?  Many/few
>> columns?  Large/small column values?
>>
>> On Mon, Oct 19, 2009 at 9:17 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>> Hi Jonathan,
>>> I actually spoke too soon. Now even if I restart the servers, it still
>>> gives a timeout exception.
>>> As for the sstable files, I'm not sure which ones are the sstables,
>>> but here is the list of files in the data directory that are prefixed
>>> with the column family name:
>>> DatastoreDeletionSchedule-1-Data.db
>>> DatastoreDeletionSchedule-1-Filter.db
>>> DatastoreDeletionSchedule-1-Index.db
>>> DatastoreDeletionSchedule-5-Data.db
>>> DatastoreDeletionSchedule-5-Filter.db
>>> DatastoreDeletionSchedule-5-Index.db
>>> DatastoreDeletionSchedule-7-Data.db
>>> DatastoreDeletionSchedule-7-Filter.db
>>> DatastoreDeletionSchedule-7-Index.db
>>> DatastoreDeletionSchedule-8-Data.db
>>> DatastoreDeletionSchedule-8-Filter.db
>>> DatastoreDeletionSchedule-8-Index.db
>>>
>>> I am not currently doing any system stat collection.
>>>
>>> On Mon, Oct 19, 2009 at 6:41 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>> How many sstable files are in the data directories for the
>>>> columnfamily you are querying?
>>>>
>>>> How many are there after you restart and it is happy?
>>>>
>>>> Are you doing system stat collection with munin or ganglia or some such?
>>>>
>>>> On Mon, Oct 19, 2009 at 8:25 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>> Hi Jonathan, I updated to 0.4.1 and I still get the same exception
>>>>> when I call get_key_range.
>>>>> I checked all the server logs, and there is only one exception being
>>>>> thrown by whichever server I am connecting to.
>>>>>
>>>>> Thanks
>>>>> Ray
>>>>>
>>>>> On Mon, Oct 19, 2009 at 4:52 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>> No, it's smart enough to avoid scanning.
>>>>>>
>>>>>> On Mon, Oct 19, 2009 at 6:49 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>> Hi Jonathan, thanks for the reply. I will update the code to 0.4.1
>>>>>>> and check all the logs on all the machines.
>>>>>>> Just a simple question: when you do a get_key_range and specify ""
>>>>>>> and "" for start and end with a limit of 25, and there are many
>>>>>>> entries, does it do a scan to find the start, or is it smart
>>>>>>> enough to know what the start key is?
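
For context, a minimal sketch of how such an open-ended range is typically
consumed with the 0.4-era client, paging by feeding the last key returned
back in as the next start. This assumes the start bound is inclusive and
that get_key_range returns a List<String> of keys (names are illustrative):

    String start = "";
    boolean firstPage = true;
    while (true) {
        List<String> keys = client.get_key_range("Keyspace1",
                "DatastoreDeletionSchedule", start, "", 25,
                ConsistencyLevel.ONE);
        for (String key : keys) {
            // Pages after the first begin at the last key already seen,
            // so skip that overlapping entry.
            if (!firstPage && key.equals(start)) continue;
            // ... process key ...
        }
        if (keys.size() < 25) break;        // short page: no more keys
        start = keys.get(keys.size() - 1);  // resume from the last key
        firstPage = false;
    }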
>>>>>>>
>>>>>>> On Mon, Oct 19, 2009 at 4:42 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>> You should check the other nodes for potential exceptions keeping
>>>>>>>> them from replying.
>>>>>>>>
>>>>>>>> Without seeing that, it's hard to say if this is caused by an old
>>>>>>>> bug, but you should definitely upgrade to 0.4.1 either way :)
>>>>>>>>
>>>>>>>> On Mon, Oct 19, 2009 at 5:51 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I am running into problems with get_key_range. I have
>>>>>>>>> OrderPreservingPartitioner defined in storage-conf.xml, and I am
>>>>>>>>> using a column family that looks like:
>>>>>>>>>     <ColumnFamily CompareWith="BytesType"
>>>>>>>>>                   Name="DatastoreDeletionSchedule"
>>>>>>>>>                   />
>>>>>>>>>
>>>>>>>>> My command is client.get_key_range("Keyspace1", "DatastoreDeletionSchedule",
>>>>>>>>>                    "", "", 25, ConsistencyLevel.ONE);
>>>>>>>>>
>>>>>>>>> It usually works fine, but after a day or so of server writes
>>>>>>>>> into this column family, I started getting:
>>>>>>>>> ERROR [pool-1-thread-36] 2009-10-19 17:24:28,223 Cassandra.java (line 770) Internal error processing get_key_range
>>>>>>>>> java.lang.RuntimeException: java.util.concurrent.TimeoutException:
>>>>>>>>> Operation timed out.
>>>>>>>>>        at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:560)
>>>>>>>>>        at org.apache.cassandra.service.CassandraServer.get_key_range(CassandraServer.java:595)
>>>>>>>>>        at org.apache.cassandra.service.Cassandra$Processor$get_key_range.process(Cassandra.java:766)
>>>>>>>>>        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:609)
>>>>>>>>>        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>>>>>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>>>>>>        at java.lang.Thread.run(Thread.java:619)
>>>>>>>>> Caused by: java.util.concurrent.TimeoutException: Operation timed out.
>>>>>>>>>        at org.apache.cassandra.net.AsyncResult.get(AsyncResult.java:97)
>>>>>>>>>        at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:556)
>>>>>>>>>        ... 7 more
>>>>>>>>>
>>>>>>>>> I still get the timeout exceptions even though the servers have
>>>>>>>>> been idle for 2 days. When I restart the Cassandra servers, it
>>>>>>>>> seems to work fine again. Any ideas what could be wrong?
>>>>>>>>>
>>>>>>>>> By the way, I am using version apache-cassandra-incubating-0.4.0-rc2.
>>>>>>>>> Not sure if this is fixed in the 0.4.1 version.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Ray
>>>>>>>>>
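
For reference, the TimeoutException in the trace above is the coordinator
giving up on replies from other nodes after the configured RPC timeout. If
the nodes are merely slow rather than failing, raising that limit in
storage-conf.xml is one thing to try; a sketch, assuming the 0.4-era
RpcTimeoutInMillis setting with its default of 5000 ms:

    <!-- Milliseconds to wait on replies from other nodes before giving
         up with a TimeoutException (assumed 0.4-era setting/default). -->
    <RpcTimeoutInMillis>10000</RpcTimeoutInMillis>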
