incubator-cassandra-user mailing list archives

From: Ramzi Rabah <rra...@playdom.com>
Subject: Re: Encountering timeout exception when running get_key_range
Date: Wed, 21 Oct 2009 19:30:03 GMT
I opened https://issues.apache.org/jira/browse/CASSANDRA-507

Ray

On Wed, Oct 21, 2009 at 12:07 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> The compaction code removes tombstones, and it runs whenever you have
> enough sstable fragments.
>
> I think I know what is happening -- as an optimization, if there is
> only one version of a row it will just copy it to the new sstable.
> This means it won't clean out tombstones.
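>
> Roughly the shape of that fast path, as a toy sketch (illustrative
> only, not Cassandra's actual compaction code):
>
>     import java.util.*;
>
>     // Toy model: a row present in only one input sstable is copied
>     // verbatim, so its tombstone survives compaction; rows merged
>     // from several sstables get their expired tombstones dropped.
>     public class CompactionSketch {
>         public static void main(String[] args) {
>             // key -> one tombstone flag per sstable containing the key
>             Map<String, List<Boolean>> versions = new HashMap<String, List<Boolean>>();
>             versions.put("lonely-tombstone", Arrays.asList(true));
>             versions.put("merged-tombstone", Arrays.asList(false, true));
>
>             for (Map.Entry<String, List<Boolean>> e : versions.entrySet()) {
>                 if (e.getValue().size() == 1) {
>                     System.out.println(e.getKey() + ": copied verbatim (tombstone kept)");
>                 } else {
>                     System.out.println(e.getKey() + ": merged, expired tombstone dropped");
>                 }
>             }
>         }
>     }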
>
> Can you file a bug at https://issues.apache.org/jira/browse/CASSANDRA ?
>
> -Jonathan
>
> On Wed, Oct 21, 2009 at 2:01 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>> Hi Jonathan, I am still running into the timeout issue even after
>> reducing GCGraceSeconds to 1 hour (we have tons of deletes
>> happening in our app). Which part of Cassandra is responsible for
>> deleting the tombstone records, and how often does it run?
>>
>>
>> On Tue, Oct 20, 2009 at 12:02 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>> Thank you so much Jonathan.
>>>
>>> Data is test data so I'll just wipe it out and restart after updating
>>> GCGraceSeconds.
>>> Thanks for your help.
>>>
>>> Ray
>>>
>>> On Tue, Oct 20, 2009 at 11:39 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>> The problem is you have a few MB of actual data and a few hundred MB
>>>> of tombstones (data marked deleted).  So what happens is get_key_range
>>>> spends a long, long time iterating through the tombstoned rows,
>>>> looking for keys that actually still exist.
>>>>
>>>> We're going to redesign this for CASSANDRA-344, but for the 0.4
>>>> series, you should restart with GCGraceSeconds much lower (e.g. 3600),
>>>> delete your old data files, and reload your data fresh.  (Instead of
>>>> reloading, you can use "nodeprobe compact" on each node to force a
>>>> major compaction but it will take much longer since you have so many
>>>> tombstones).
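>>>>
>>>> For reference, that knob is the GCGraceSeconds element in
>>>> storage-conf.xml in the 0.4 series (assuming the stock config
>>>> layout; the default is much higher):
>>>>
>>>>     <!-- seconds to keep tombstones before compaction may purge them -->
>>>>     <GCGraceSeconds>3600</GCGraceSeconds>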
>>>>
>>>> -Jonathan
>>>>
>>>> On Mon, Oct 19, 2009 at 10:45 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>> Hi Jonathan:
>>>>>
>>>>> Here is the storage-conf.xml for one of the servers:
>>>>> http://email.slicezero.com/storage-conf.xml
>>>>>
>>>>> and here is the zipped data:
>>>>> http://email.slicezero.com/datastoreDeletion.tgz
>>>>>
>>>>> Thanks
>>>>> Ray
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Oct 19, 2009 at 8:30 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>> Yes, please. You'll probably have to use something like
>>>>>> http://www.getdropbox.com/ if you don't have a public web server to
>>>>>> stash it temporarily.
>>>>>>
>>>>>> On Mon, Oct 19, 2009 at 10:28 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>> Hi Jonathan, the data is about 60 MB. Would you like me to send it to you?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 19, 2009 at 8:20 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>> Is the data on 6, 9, or 10 small enough that you could tar.gz it up
>>>>>>>> for me to use to reproduce over here?
>>>>>>>>
>>>>>>>> On Mon, Oct 19, 2009 at 10:17 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>> So my cluster has 4 nodes: node6, node8, node9, and node10. I turned
>>>>>>>>> them all off.
>>>>>>>>> 1- I started node6 by itself and still got the problem.
>>>>>>>>> 2- I started node8 by itself and it ran fine (returned no keys).
>>>>>>>>> 3- I started node9 by itself and still got the problem.
>>>>>>>>> 4- I started node10 by itself and still got the problem.
>>>>>>>>>
>>>>>>>>> Ray
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 19, 2009 at 7:44 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>>>> That's really strange... Can you reproduce on a single-node cluster?
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 19, 2009 at 9:34 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>>>> The rows are very small. There are a handful of columns per row
>>>>>>>>>>> (approximately 4-5).
>>>>>>>>>>> Each column has a name which is a String (20-30 characters long), and
>>>>>>>>>>> the value is an empty array of bytes (new byte[0]).
>>>>>>>>>>> I just use the names of the columns, and don't need to store any
>>>>>>>>>>> values in this Column Family.
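>>>>>>>>>>>
>>>>>>>>>>> For concreteness, each write looks roughly like this (a sketch
>>>>>>>>>>> against the 0.4-era Thrift API; the key and column name below are
>>>>>>>>>>> made up, and the exact ColumnPath signature may differ):
>>>>>>>>>>>
>>>>>>>>>>>     String columnName = "pending-delete-2009-10-19-0001"; // 20-30 chars
>>>>>>>>>>>     client.insert("Keyspace1", "some-row-key",
>>>>>>>>>>>                   new ColumnPath("DatastoreDeletionSchedule", null,
>>>>>>>>>>>                                 columnName.getBytes()),
>>>>>>>>>>>                   new byte[0],                // empty value: the name is the data
>>>>>>>>>>>                   System.currentTimeMillis(), // client-supplied timestamp
>>>>>>>>>>>                   ConsistencyLevel.ONE);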
>>>>>>>>>>>
>>>>>>>>>>> -- Ray
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 19, 2009 at 7:24 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>>>>>> Can you tell me anything about the nature of your rows? Many/few
>>>>>>>>>>>> columns? Large/small column values?
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 19, 2009 at 9:17 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>>>>>> Hi Jonathan,
>>>>>>>>>>>>> I actually spoke too early. Now even if I restart the servers it still
>>>>>>>>>>>>> gives a timeout exception.
>>>>>>>>>>>>> As for the sstable files, I'm not sure which ones are the sstables,
>>>>>>>>>>>>> but here is the list of files in the data directory that are prefixed
>>>>>>>>>>>>> with the column family name:
>>>>>>>>>>>>> DatastoreDeletionSchedule-1-Data.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-1-Filter.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-1-Index.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-5-Data.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-5-Filter.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-5-Index.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-7-Data.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-7-Filter.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-7-Index.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-8-Data.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-8-Filter.db
>>>>>>>>>>>>> DatastoreDeletionSchedule-8-Index.db
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not currently doing any system stat collection.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 6:41 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>>>>>>>> How many sstable files are in the data directories for the
>>>>>>>>>>>>>> columnfamily you are querying?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How many are there after you restart and it is happy?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Are you doing system stat collection with munin or ganglia or some such?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 8:25 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>>>>>>>> Hi Jonathan, I updated to 0.4.1 and I still get the same exception when I
>>>>>>>>>>>>>>> call get_key_range.
>>>>>>>>>>>>>>> I checked all the server logs, and there is only one exception being
>>>>>>>>>>>>>>> thrown by whichever server I am connecting to.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> Ray
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 4:52 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>>>>>>>>>> No, it's smart enough to avoid scanning.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 6:49 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>>>>>>>>>> Hi Jonathan, thanks for the reply. I will update the code to 0.4.1 and
>>>>>>>>>>>>>>>>> will check all the logs on all the machines.
>>>>>>>>>>>>>>>>> Just a simple question: when you do a get_key_range and you specify ""
>>>>>>>>>>>>>>>>> and "" for start and end, and the limit is 25, if there are too many
>>>>>>>>>>>>>>>>> entries, does it do a scan to find out the start, or is it smart enough
>>>>>>>>>>>>>>>>> to know what the start key is?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 4:42 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>>>>>>>>>>>> You should check the other nodes for potential exceptions keeping them
>>>>>>>>>>>>>>>>>> from replying.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Without seeing that, it's hard to say if this is caused by an old bug,
>>>>>>>>>>>>>>>>>> but you should definitely upgrade to 0.4.1 either way :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 5:51 PM, Ramzi Rabah <rrabah@playdom.com> wrote:
>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am running into problems with get_key_range. I have
>>>>>>>>>>>>>>>>>>> OrderPreservingPartitioner defined in storage-conf.xml and I am using
>>>>>>>>>>>>>>>>>>> a columnfamily that looks like
>>>>>>>>>>>>>>>>>>>     <ColumnFamily CompareWith="BytesType"
>>>>>>>>>>>>>>>>>>>                   Name="DatastoreDeletionSchedule" />
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> My command is
>>>>>>>>>>>>>>>>>>> client.get_key_range("Keyspace1", "DatastoreDeletionSchedule",
>>>>>>>>>>>>>>>>>>>                      "", "", 25, ConsistencyLevel.ONE);
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It usually works fine, but after a day or so of server writes into
>>>>>>>>>>>>>>>>>>> this column family, I started getting
>>>>>>>>>>>>>>>>>>> ERROR [pool-1-thread-36] 2009-10-19 17:24:28,223 Cassandra.java (line
>>>>>>>>>>>>>>>>>>> 770) Internal error processing get_key_range
>>>>>>>>>>>>>>>>>>> java.lang.RuntimeException: java.util.concurrent.TimeoutException:
>>>>>>>>>>>>>>>>>>> Operation timed out.
>>>>>>>>>>>>>>>>>>>         at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:560)
>>>>>>>>>>>>>>>>>>>         at org.apache.cassandra.service.CassandraServer.get_key_range(CassandraServer.java:595)
>>>>>>>>>>>>>>>>>>>         at org.apache.cassandra.service.Cassandra$Processor$get_key_range.process(Cassandra.java:766)
>>>>>>>>>>>>>>>>>>>         at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:609)
>>>>>>>>>>>>>>>>>>>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>>>>>>>>>>>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>>>>>>>>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>>>>>>>>>>>>>>>>         at java.lang.Thread.run(Thread.java:619)
>>>>>>>>>>>>>>>>>>> Caused by: java.util.concurrent.TimeoutException: Operation timed out.
>>>>>>>>>>>>>>>>>>>         at org.apache.cassandra.net.AsyncResult.get(AsyncResult.java:97)
>>>>>>>>>>>>>>>>>>>         at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:556)
>>>>>>>>>>>>>>>>>>>         ... 7 more
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I still get the timeout exceptions even though the servers have been
>>>>>>>>>>>>>>>>>>> idle for 2 days. When I restart the cassandra servers, it seems to
>>>>>>>>>>>>>>>>>>> work fine again. Any ideas what could be wrong?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> By the way, I am using version apache-cassandra-incubating-0.4.0-rc2.
>>>>>>>>>>>>>>>>>>> Not sure if this is fixed in the 0.4.1 version.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>> Ray
