From user-return-13219-apmail-cassandra-user-archive=cassandra.apache.org@cassandra.apache.org Thu Feb 10 21:37:50 2011 Return-Path: Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: (qmail 2921 invoked from network); 10 Feb 2011 21:37:49 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 10 Feb 2011 21:37:49 -0000 Received: (qmail 23863 invoked by uid 500); 10 Feb 2011 21:37:47 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 23807 invoked by uid 500); 10 Feb 2011 21:37:46 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 23799 invoked by uid 99); 10 Feb 2011 21:37:46 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 10 Feb 2011 21:37:46 +0000 X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,MIME_QP_LONG_LINE,RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: local policy) Received: from [208.113.200.5] (HELO homiemail-a78.g.dreamhost.com) (208.113.200.5) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 10 Feb 2011 21:37:39 +0000 Received: from homiemail-a78.g.dreamhost.com (localhost [127.0.0.1]) by homiemail-a78.g.dreamhost.com (Postfix) with ESMTP id 577DE15C064 for ; Thu, 10 Feb 2011 13:37:13 -0800 (PST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=thelastpickle.com; h=to:from :subject:date:message-id:content-type:mime-version:in-reply-to; q=dns; s=thelastpickle.com; b=AjyfBcCPflin966+hhT8KZGsjTVTybki6 uOOdpLXp6i//f8rF8+aCQGI2k1XOznAZHhY/hdm2lnUiYAy2vj7WoCtb7G7lK7G+ +AvEIXN6Lwh6jNYR7K1/5j74Yqug4f8+0Av9fyV+Z2DVNz3f1bPytajE0KI1RJbb 7g+1EKv8is= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=thelastpickle.com; h=to :from:subject:date:message-id:content-type:mime-version: in-reply-to; s=thelastpickle.com; bh=YBfQvC1nm/DKiamSdz7i3Nym8TE =; b=40mJUd+IleF80OruRp84Z7Vw1a1ruIZaqf+8T4jqVlFxiTkqN1qpPoLHlXR hUnIjPmTLJWJsbxPNkAQAwxb8g2gTFgyMqxnv6b38jlE3+Ku/xZjw3K/duOdznPN 36W1dwJ3FfwZU+P7WU3nCFMSKoKa6r9ILnjiyhVP2ztLlbM0= Received: from localhost (webms.mac.com [17.148.16.116]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: aaron@thelastpickle.com) by homiemail-a78.g.dreamhost.com (Postfix) with ESMTPSA id 3F33B15C059 for ; Thu, 10 Feb 2011 13:37:13 -0800 (PST) To: user@cassandra.apache.org From: Aaron Morton Subject: Re: Super Slow Multi-gets Date: Thu, 10 Feb 2011 21:37:12 GMT X-Mailer: MobileMe Mail (1C3224) Message-id: <9150ac4d-3441-648e-f90e-86555cfaa2fa@me.com> Content-Type: multipart/alternative; boundary=Apple-Webmail-42--d725df0c-926e-cf6a-c414-c351d8a99fc6 MIME-Version: 1.0 In-Reply-To: X-Virus-Checked: Checked by ClamAV on apache.org --Apple-Webmail-42--d725df0c-926e-cf6a-c414-c351d8a99fc6 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=ISO-8859-1; format=flowed Assuming cassandra 0.7 in log4j-server.properties make it look like this..= =0A=0Alog4j.rootLogger=3DDEBUG,stdout,R=0A=0A=0AA=0AOn 11 Feb, 2011,at 10= :30 AM, Bill Speirs wrote:=0A=0AI switched my impl= ementation to use a thread pool of 10 threads each=0Amulti-getting 10 keys= /rows. This reduces my time from 50s to 5s for=0Afetching all 1,000 messag= es.=0A=0AI started looking through the Cassandra source to find where the=0A= parallel requests are actually made, and I believe it's in=0Aorg.apache.ca= ssandra.service.StorageProxy.java fetchRows, is this=0Acorrect? I noticed = a number of logger.debug calls, what do I need to=0Aset in my log4j.proper= ties file to see these messages as they would=0Aprobably help me determine= what is taking so long. Currently my=0Alog4j.properties file looks like t= his and I'm not seeing these=0Amessages:=0A=0Alog4j.appender.stdout=3Dorg.= apache.log4j.ConsoleAppender=0Alog4j.appender.stdout.layout=3Dorg.apache.l= og4j.SimpleLayout=0Alog4j.category.org.apache=3DDEBUG, stdout=0Alog4j.cate= gory.me.prettyprint=3DDEBUG, stdout=0A=0AThanks...=0A=0ABill-=0A=0A=0AOn T= hu, Feb 10, 2011 at 12:53 PM, Bill Speirs wrote:=0A= > Each message row is well under 1K. So I don't think it is network... plu= s=0A> all boxes are on a fast LAN.=0A>=0A> Bill-=0A>=0A> On Feb 10, 2011 1= 1:59 AM, "Utku Can Top=E7u" wrote:=0A>> Dear Bill,=0A>= >=0A>> How about the size of the row in the Messages CF. Is it too big? Mi= ght you=0A>> be having an overhead of the bandwidth?=0A>>=0A>> Regards,=0A= >> Utku=0A>>=0A>> On Thu, Feb 10, 2011 at 5:00 PM, Bill Speirs =0A>> wrote:=0A>>=0A>>> I have a 7 node setup with a replicati= on factor of 1 and a read=0A>>> consistency of 1. I have two column famili= es: Messages which stores=0A>>> millions of rows with a UUID for the row k= ey, DateIndex which stores=0A>>> thousands of rows with a String as the ro= w key. I perform 2 look-ups=0A>>> for my queries:=0A>>>=0A>>> 1) Fetch the= row from DateIndex that includes the date I'm looking=0A>>> for. This ret= urns 1,000 columns where the column names are the UUID of=0A>>> the messag= es=0A>>> 2) Do a multi-get (Hector client) using those 1,000 row keys I go= t=0A>>> from the first query.=0A>>>=0A>>> Query 1 is taking ~300ms to fetc= h 1,000 columns from a single row...=0A>>> respectable. However, query 2 i= s taking over 50s to perform 1,000 row=0A>>> look-ups! Also, when I scale = down to 100 row look-ups for query 2, the=0A>>> time scales in a similar f= ashion, down to 5s.=0A>>>=0A>>> Am I doing something wrong here? It seems = like taking 5s to look-up=0A>>> 100 rows in a distributed hash table is wa= y too slow.=0A>>>=0A>>> Thoughts?=0A>>>=0A>>> Bill-=0A>>>=0A>=0A --Apple-Webmail-42--d725df0c-926e-cf6a-c414-c351d8a99fc6 Content-Type: multipart/related; type="text/html"; boundary=Apple-Webmail-86--d725df0c-926e-cf6a-c414-c351d8a99fc6 --Apple-Webmail-86--d725df0c-926e-cf6a-c414-c351d8a99fc6 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=ISO-8859-1;
Assuming cassandra 0.7 in log4j-server.properties make it look like t= his...

log4j.rootLogger=3DDEBUG,stdout,R

<= /div>

A
On 11 Feb, 2011,at 10:30 AM, Bill Speirs <= ;bill.speirs@gmail.com> wrote:

I switched my impleme= ntation to use a thread pool of 10 threads each
=0Amulti-getting 10 key= s/rows. This reduces my time from 50s to 5s for
=0Afetching all 1,000 m= essages.
=0A
=0AI started looking through the Cassandra source to fi= nd where the
=0Aparallel requests are actually made, and I believe it's= in
=0Aorg.apache.cassandra.service.StorageProxy.java fetchRows, is thi= s
=0Acorrect? I noticed a number of logger.debug calls, what do I need = to
=0Aset in my log4j.properties file to see these messages as they wou= ld
=0Aprobably help me determine what is taking so long. Currently my=0Alog4j.properties file looks like this and I'm not seeing these
=0A= messages:
=0A
=0Alog4j.appender.stdout=3Dorg.apache.log4j.ConsoleApp= ender
=0Alog4j.appender.stdout.layout=3Dorg.apache.log4j.SimpleLayout=0Alog4j.category.org.apache=3DDEBUG, stdout
=0Alog4j.category.me.pre= ttyprint=3DDEBUG, stdout
=0A
=0AThanks...
=0A
=0ABill-
=0A<= br>=0A
=0AOn Thu, Feb 10, 2011 at 12:53 PM, Bill Speirs <b= ill.speirs@gmail.com> wrote:
=0A> Each message row is well un= der 1K. So I don't think it is network... plus
=0A> all boxes are on= a fast LAN.
=0A>
=0A> Bill-
=0A>
=0A> On Feb 10, = 2011 11:59 AM, "Utku Can Top=E7u" <utku@topcu.gen.tr> wrote:=0A>> Dear Bill,
=0A>>
=0A>> How about the size = of the row in the Messages CF. Is it too big? Might you
=0A>> be = having an overhead of the bandwidth?
=0A>>
=0A>> Regards= ,
=0A>> Utku
=0A>>
=0A>> On Thu, Feb 10, 2011 a= t 5:00 PM, Bill Speirs <bill.speirs@gmail.com>
=0A&= gt;> wrote:
=0A>>
=0A>>> I have a 7 node setup wit= h a replication factor of 1 and a read
=0A>>> consistency of 1= I have two column families: Messages which stores
=0A>>> mil= lions of rows with a UUID for the row key, DateIndex which stores
=0A&g= t;>> thousands of rows with a String as the row key. I perform 2 loo= k-ups
=0A>>> for my queries:
=0A>>>
=0A>>= > 1) Fetch the row from DateIndex that includes the date I'm looking=0A>>> for. This returns 1,000 columns where the column names ar= e the UUID of
=0A>>> the messages
=0A>>> 2) Do a m= ulti-get (Hector client) using those 1,000 row keys I got
=0A>>&g= t; from the first query.
=0A>>>
=0A>>> Query 1 is = taking ~300ms to fetch 1,000 columns from a single row...
=0A>>&g= t; respectable. However, query 2 is taking over 50s to perform 1,000 row=0A>>> look-ups! Also, when I scale down to 100 row look-ups fo= r query 2, the
=0A>>> time scales in a similar fashion, down t= o 5s.
=0A>>>
=0A>>> Am I doing something wrong her= e? It seems like taking 5s to look-up
=0A>>> 100 rows in a dis= tributed hash table is way too slow.
=0A>>>
=0A>>>= Thoughts?
=0A>>>
=0A>>> Bill-
=0A>>><= br>=0A>
=0A
--Apple-Webmail-86--d725df0c-926e-cf6a-c414-c351d8a99fc6-- --Apple-Webmail-42--d725df0c-926e-cf6a-c414-c351d8a99fc6--