From: Luís Ferreira
To: user@cassandra.apache.org
Subject: Re: Timeout Exception in get_slice
Date: Thu, 10 May 2012 11:05:08 +0100
Message-Id: <8E496D9D-CB60-4067-B0F2-6E9F5C572106@gmail.com>
References: <2A0E1370-1A81-4F3F-AFD7-A39140303558@gmail.com>

The multi get batches range from 100 to 200.

The tests I'm running need to do get_slices and then the multigets on those results, so I can't turn either of them off.

I was only setting 16 threads for reading, but I'll boost it up to 32 and see what happens.

On May 9, 2012, at 11:03 AM, aaron morton wrote:

> How big are the multi get batches?
>
> How do the wide row get_slice calls behave when the multi gets are not running?
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 9/05/2012, at 1:47 AM, Luís Ferreira wrote:
>
>> Maybe one of the problems is that I am reading both the columns within a row and the rows themselves in batches: for columns I use the count attribute of the SliceRange and move the start column forward, and for rows I do the equivalent with the KeyRange. According to your blog post, using a start key to read through millions of rows/columns has a lot of latency, but how else can I read an entire row that does not fit into memory?
>>
>> I'll have to run some tests again and check the tpstats. Still, do you think that adding more machines to the cluster will help a lot? I ask because I started with a 3 node cluster and have scaled to a 5 node cluster with little improvement...
>>
>> Thanks anyway.
>>
>> On May 8, 2012, at 9:54 AM, aaron morton wrote:
>>
>>> If I was rebuilding my power after spending the first thousand years of the Third Age as a shapeless evil I would cast my Eye of Fire in the direction of the filthy little multi_gets.
>>>
>>> A node can fail to respond to a query within rpc_timeout for two reasons: either the command did not run, or the command started but did not complete. The former is much more likely. If it is happening you will see large pending counts and dropped messages in nodetool tpstats, and you will also see log entries about dropped messages.
>>>
>>> When you send a multi_get, each row you request becomes a message in the read thread pool. If you request 100 rows you will put 100 messages in the pool, which by default has 32 threads. If some clients are sending large multi gets (or batch mutations) you can overload nodes and starve other clients.
>>>
>>> For background, some metrics here for selecting from 10 million columns: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
>>>
>>> Hope that helps.
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 6/05/2012, at 7:14 AM, Luís Ferreira wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm doing get_slice on huge rows (3 million columns) and even though I am doing it iteratively I keep getting TimeoutExceptions. I've tried changing the number of columns fetched, but it did not help.
>>>>
>>>> I have a 5 machine cluster, each machine with 4GB of RAM of which 3GB are dedicated to Cassandra's heap, but they still all consume all of the memory and get huge IO wait due to the amount of reads.
>>>>
>>>> I am running tests with 100 clients all performing multiple operations, mostly get_slice, get and multi_get, but the timeouts only occur in the get_slice.
>>>>
>>>> Does this have anything to do with Cassandra's ability (or lack thereof) to keep the rows in memory? Or am I doing anything wrong? Any tips?
>>>>
>>>> Regards,
>>>> Luís Ferreira
>>
>> Regards,
>> Luís Ferreira

Regards,
Luís Ferreira
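
For reference, the paging that Luís describes in this thread (a fixed count per SliceRange, with the start column moved to the last column seen) can be sketched as below. This is a minimal local simulation, not Cassandra client code: `get_slice` and `iter_columns` are hypothetical helpers, the row is an in-memory sorted list, and the start bound is assumed to be inclusive, so every page after the first repeats one column that has to be dropped.

```python
from bisect import bisect_left


def get_slice(row, start, count):
    """Stand-in for a slice call: up to `count` columns with name >= start.

    `row` is a list of (name, value) pairs sorted by name. This is a local
    simulation for illustration, not a call into a Cassandra client.
    """
    names = [name for name, _ in row]
    i = bisect_left(names, start)
    return row[i:i + count]


def iter_columns(row, page_size):
    """Yield every column of `row`, fetching at most `page_size` per call."""
    if page_size < 2:
        # With an inclusive start, a page of 1 would only ever return the
        # column we already have, so the loop could never advance.
        raise ValueError("page_size must be at least 2")
    start = ""
    first_page = True
    while True:
        page = get_slice(row, start, page_size)
        # After the first page, the first column repeats the previous
        # page's last column (inclusive start), so skip it.
        fresh = page if first_page else page[1:]
        for column in fresh:
            yield column
        if len(page) < page_size:
            return  # a short page means the row is exhausted
        start = page[-1][0]
        first_page = False


if __name__ == "__main__":
    # Zero-padded names so that lexical order matches numeric order.
    wide_row = [(f"col{i:07d}", i) for i in range(1000)]
    total = sum(1 for _ in iter_columns(wide_row, page_size=100))
    print(total)
```

With this pattern only `page_size` columns are held at a time, at the cost of one round trip per page. It does not by itself address the read-pool contention discussed in the thread.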

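
As a rough illustration of the read-pool point made in this thread, the arithmetic below estimates how many read messages would be waiting on one node if all 100 test clients issued a multi_get at once. The function name and the model are assumptions made for illustration only: rows are taken to spread evenly over the 5 nodes, each requested row counts as one read message, and replication and consistency level are ignored.

```python
import math


def waiting_reads(rows_per_multiget, clients, nodes, read_threads):
    """Back-of-envelope count of read messages queued on one node.

    Assumes every client sends one multi_get at the same instant, the
    requested rows spread evenly over the nodes, and each row costs one
    read message. Replication and consistency level are ignored.
    """
    per_node = math.ceil(rows_per_multiget * clients / nodes)
    return max(0, per_node - read_threads)


if __name__ == "__main__":
    # Figures from the thread: 100 clients, 5 nodes, batches of 100 to 200,
    # and read pools of 16 (Luís's setting) or 32 threads.
    for threads in (16, 32):
        for batch in (100, 200):
            queued = waiting_reads(batch, clients=100, nodes=5, read_threads=threads)
            print(f"{threads} threads, batch of {batch}: ~{queued} reads waiting")
```

Under these simplifications the queue depth is dominated by the batch size and client count rather than by the thread count, which fits the observation that a get_slice arriving behind such a burst may time out before it even starts.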