From: Ramzi Rabah
To: cassandra-user@incubator.apache.org
Date: Tue, 20 Oct 2009 12:02:40 -0700
Subject: Re: Encountering timeout exception when running get_key_range

Thank you so much, Jonathan. The data is test data, so I'll just wipe it
out and restart after updating GCGraceSeconds. Thanks for your help.

Ray

On Tue, Oct 20, 2009 at 11:39 AM, Jonathan Ellis wrote:
> The problem is you have a few MB of actual data and a few hundred MB
> of tombstones (data marked deleted). So what happens is get_key_range
> spends a long, long time iterating through the tombstoned rows,
> looking for keys that actually still exist.
>
> We're going to redesign this for CASSANDRA-344, but for the 0.4
> series, you should restart with GCGraceSeconds much lower (e.g. 3600),
> delete your old data files, and reload your data fresh. (Instead of
> reloading, you can use "nodeprobe compact" on each node to force a
> major compaction, but it will take much longer since you have so many
> tombstones.)
>
> -Jonathan
>
> On Mon, Oct 19, 2009 at 10:45 PM, Ramzi Rabah wrote:
>> Hi Jonathan:
>>
>> Here is the storage-conf.xml for one of the servers:
>> http://email.slicezero.com/storage-conf.xml
>>
>> and here is the zipped data:
>> http://email.slicezero.com/datastoreDeletion.tgz
>>
>> Thanks
>> Ray
>>
>> On Mon, Oct 19, 2009 at 8:30 PM, Jonathan Ellis wrote:
>>> Yes, please. You'll probably have to use something like
>>> http://www.getdropbox.com/ if you don't have a public web server to
>>> stash it temporarily.
>>>
>>> On Mon, Oct 19, 2009 at 10:28 PM, Ramzi Rabah wrote:
>>>> Hi Jonathan, the data is about 60 MB. Would you like me to send it
>>>> to you?
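A concrete sketch of the workaround Jonathan describes above. The
GcGraceSeconds element name follows later 0.x storage-conf.xml files and
is an assumption here, as is the exact nodeprobe invocation; verify both
against the release you are running.

    <!-- storage-conf.xml: make tombstones eligible for collection after
         one hour instead of the stock, much larger default -->
    <GcGraceSeconds>3600</GcGraceSeconds>

    # afterwards, force a major compaction on each node, e.g.:
    bin/nodeprobe -host node6 compact

Lowering the grace period this far is reasonable here only because the
data is disposable test data; on a live cluster it must stay longer than
the time needed to repair a failed node, or deleted rows can resurface.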
>>>>
>>>> On Mon, Oct 19, 2009 at 8:20 PM, Jonathan Ellis wrote:
>>>>> Is the data on 6, 9, or 10 small enough that you could tar.gz it up
>>>>> for me to use to reproduce over here?
>>>>>
>>>>> On Mon, Oct 19, 2009 at 10:17 PM, Ramzi Rabah wrote:
>>>>>> So my cluster has 4 nodes: node6, node8, node9, and node10. I
>>>>>> turned them all off.
>>>>>> 1. I started node6 by itself and still got the problem.
>>>>>> 2. I started node8 by itself and it ran fine (returned no keys).
>>>>>> 3. I started node9 by itself and still got the problem.
>>>>>> 4. I started node10 by itself and still got the problem.
>>>>>>
>>>>>> Ray
>>>>>>
>>>>>> On Mon, Oct 19, 2009 at 7:44 PM, Jonathan Ellis wrote:
>>>>>>> That's really strange... Can you reproduce on a single-node
>>>>>>> cluster?
>>>>>>>
>>>>>>> On Mon, Oct 19, 2009 at 9:34 PM, Ramzi Rabah wrote:
>>>>>>>> The rows are very small. There are a handful of columns per row
>>>>>>>> (approximately 4-5 columns per row).
>>>>>>>> Each column has a name which is a String (20-30 characters long),
>>>>>>>> and the value is an empty array of bytes (new byte[0]).
>>>>>>>> I just use the names of the columns and don't need to store any
>>>>>>>> values in this column family.
>>>>>>>>
>>>>>>>> -- Ray
>>>>>>>>
>>>>>>>> On Mon, Oct 19, 2009 at 7:24 PM, Jonathan Ellis wrote:
>>>>>>>>> Can you tell me anything about the nature of your rows? Many/few
>>>>>>>>> columns? Large/small column values?
>>>>>>>>>
>>>>>>>>> On Mon, Oct 19, 2009 at 9:17 PM, Ramzi Rabah wrote:
>>>>>>>>>> Hi Jonathan,
>>>>>>>>>> I actually spoke too early. Now even if I restart the servers
>>>>>>>>>> it still gives a timeout exception.
>>>>>>>>>> I'm not sure which files are the sstables, but here is the list
>>>>>>>>>> of files in the data directory that are prefixed with the
>>>>>>>>>> column family name:
>>>>>>>>>> DatastoreDeletionSchedule-1-Data.db
>>>>>>>>>> DatastoreDeletionSchedule-1-Filter.db
>>>>>>>>>> DatastoreDeletionSchedule-1-Index.db
>>>>>>>>>> DatastoreDeletionSchedule-5-Data.db
>>>>>>>>>> DatastoreDeletionSchedule-5-Filter.db
>>>>>>>>>> DatastoreDeletionSchedule-5-Index.db
>>>>>>>>>> DatastoreDeletionSchedule-7-Data.db
>>>>>>>>>> DatastoreDeletionSchedule-7-Filter.db
>>>>>>>>>> DatastoreDeletionSchedule-7-Index.db
>>>>>>>>>> DatastoreDeletionSchedule-8-Data.db
>>>>>>>>>> DatastoreDeletionSchedule-8-Filter.db
>>>>>>>>>> DatastoreDeletionSchedule-8-Index.db
>>>>>>>>>>
>>>>>>>>>> I am not currently doing any system stat collection.
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 19, 2009 at 6:41 PM, Jonathan Ellis wrote:
>>>>>>>>>>> How many sstable files are in the data directories for the
>>>>>>>>>>> column family you are querying?
>>>>>>>>>>>
>>>>>>>>>>> How many are there after you restart and it is happy?
>>>>>>>>>>>
>>>>>>>>>>> Are you doing system stat collection with munin or ganglia or
>>>>>>>>>>> some such?
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 19, 2009 at 8:25 PM, Ramzi Rabah wrote:
>>>>>>>>>>>> Hi Jonathan, I updated to 0.4.1 and I still get the same
>>>>>>>>>>>> exception when I call get_key_range.
>>>>>>>>>>>> I checked all the server logs, and there is only one
>>>>>>>>>>>> exception being thrown, by whichever server I am connecting
>>>>>>>>>>>> to.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Ray
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 19, 2009 at 4:52 PM, Jonathan Ellis wrote:
>>>>>>>>>>>>> No, it's smart enough to avoid scanning.
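To illustrate the range semantics being discussed (the question this
answers appears in the quoted message just below): a hedged sketch of
paging through keys 25 at a time with get_key_range. Here "client" is
the same Cassandra.Client as in the original report; whether the start
bound is inclusive varies by release, so treat this as a sketch rather
than a reference implementation.

    List<String> page = client.get_key_range("Keyspace1",
        "DatastoreDeletionSchedule", "", "", 25, ConsistencyLevel.ONE);
    while (true) {
        for (String key : page) {
            System.out.println(key);  // process each key
        }
        if (page.size() < 25) {
            break;  // a short page means the range is exhausted
        }
        String last = page.get(page.size() - 1);
        // Resume the scan at the last key seen; if the start bound is
        // inclusive in your release, skip the duplicate first result.
        page = client.get_key_range("Keyspace1",
            "DatastoreDeletionSchedule", last, "", 25,
            ConsistencyLevel.ONE);
    }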
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 6:49 PM, Ramzi Rabah wrote:
>>>>>>>>>>>>>> Hi Jonathan, thanks for the reply. I will update the code
>>>>>>>>>>>>>> to 0.4.1 and will check all the logs on all the machines.
>>>>>>>>>>>>>> Just a simple question: when you do a get_key_range and you
>>>>>>>>>>>>>> specify "" and "" for start and end, and the limit is 25,
>>>>>>>>>>>>>> if there are too many entries, does it do a scan to find
>>>>>>>>>>>>>> the start, or is it smart enough to know what the start key
>>>>>>>>>>>>>> is?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 4:42 PM, Jonathan Ellis wrote:
>>>>>>>>>>>>>>> You should check the other nodes for potential exceptions
>>>>>>>>>>>>>>> keeping them from replying.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Without seeing that, it's hard to say if this is caused by
>>>>>>>>>>>>>>> an old bug, but you should definitely upgrade to 0.4.1
>>>>>>>>>>>>>>> either way :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Oct 19, 2009 at 5:51 PM, Ramzi Rabah wrote:
>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am running into problems with get_key_range. I have
>>>>>>>>>>>>>>>> OrderPreservingPartitioner defined in storage-conf.xml
>>>>>>>>>>>>>>>> and I am using a column family that looks like:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     <ColumnFamily ... Name="DatastoreDeletionSchedule" />
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My command is:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     client.get_key_range("Keyspace1",
>>>>>>>>>>>>>>>>         "DatastoreDeletionSchedule", "", "", 25,
>>>>>>>>>>>>>>>>         ConsistencyLevel.ONE);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It usually works fine, but after a day or so of server
>>>>>>>>>>>>>>>> writes into this column family, I started getting:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ERROR [pool-1-thread-36] 2009-10-19 17:24:28,223
>>>>>>>>>>>>>>>> Cassandra.java (line 770) Internal error processing
>>>>>>>>>>>>>>>> get_key_range
>>>>>>>>>>>>>>>> java.lang.RuntimeException:
>>>>>>>>>>>>>>>> java.util.concurrent.TimeoutException: Operation timed out.
>>>>>>>>>>>>>>>>     at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:560)
>>>>>>>>>>>>>>>>     at org.apache.cassandra.service.CassandraServer.get_key_range(CassandraServer.java:595)
>>>>>>>>>>>>>>>>     at org.apache.cassandra.service.Cassandra$Processor$get_key_range.process(Cassandra.java:766)
>>>>>>>>>>>>>>>>     at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:609)
>>>>>>>>>>>>>>>>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>>>>>>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>>>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>>>>>>>>>>>>>     at java.lang.Thread.run(Thread.java:619)
>>>>>>>>>>>>>>>> Caused by: java.util.concurrent.TimeoutException: Operation timed out.
>>>>>>>>>>>>>>>>     at org.apache.cassandra.net.AsyncResult.get(AsyncResult.java:97)
>>>>>>>>>>>>>>>>     at org.apache.cassandra.service.StorageProxy.getKeyRange(StorageProxy.java:556)
>>>>>>>>>>>>>>>>     ... 7 more
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I still get the timeout exceptions even though the
>>>>>>>>>>>>>>>> servers have been idle for 2 days. When I restart the
>>>>>>>>>>>>>>>> Cassandra servers, it seems to work fine again.
>>>>>>>>>>>>>>>> Any ideas what could be wrong?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> By the way, I am using version
>>>>>>>>>>>>>>>> apache-cassandra-incubating-0.4.0-rc2. Not sure if this
>>>>>>>>>>>>>>>> is fixed in the 0.4.1 version.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Ray
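For completeness, here is the failing call in a minimal, self-contained
client against the 0.4-era Thrift interface. The package names match the
stack trace quoted above; the host name and the default RPC port 9160
are assumptions for the sketch.

    import java.util.List;

    import org.apache.cassandra.service.Cassandra;
    import org.apache.cassandra.service.ConsistencyLevel;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class KeyRangeExample {
        public static void main(String[] args) throws Exception {
            // Assumed host and port; 9160 was the usual Thrift RPC port.
            TTransport transport = new TSocket("node6", 9160);
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();
            try {
                // The call from the report: the first 25 keys of the
                // unbounded range, readable in key order because the
                // cluster uses OrderPreservingPartitioner.
                List<String> keys = client.get_key_range(
                    "Keyspace1", "DatastoreDeletionSchedule",
                    "", "", 25, ConsistencyLevel.ONE);
                for (String key : keys) {
                    System.out.println(key);
                }
            } finally {
                transport.close();
            }
        }
    }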