From: Luís Ferreira
Subject: Re: Timeout Exception in get_slice
Date: Tue, 8 May 2012 14:47:56 +0100
To: user@cassandra.apache.org
Message-Id: <2A0E1370-1A81-4F3F-AFD7-A39140303558@gmail.com>

Maybe one of the problems is that I am reading both the columns in a row and the rows themselves in batches: I use the count attribute of the SliceRange and advance the start column for each batch, and do the equivalent for rows with the KeyRange. According to your blog post, using a start key to read through millions of rows/columns adds a lot of latency, but how else can I read an entire row that does not fit into memory?

I'll have to run some tests again and check the tpstats. Still, do you think that adding more machines to the cluster will help a lot? I ask because I started with a 3 node cluster and have scaled to a 5 node cluster with little improvement...

Thanks anyway.
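
For reference, a minimal Python sketch of the batched read described above. It is only an illustration under stated assumptions: `fetch_slice(start, count)` is a hypothetical stand-in for a get_slice call with a SliceRange (start column, empty finish, count), and `iter_row`, `make_fake_row` and the page size are made-up names and values, not part of any client library. It assumes the start column is inclusive, so every page after the first repeats the last column already seen and that column is skipped.

```python
from typing import Callable, Iterator, List, Tuple

Column = Tuple[str, str]  # (column name, value)


def iter_row(fetch_slice: Callable[[str, int], List[Column]],
             page_size: int = 1000) -> Iterator[Column]:
    """Yield every column of one wide row, one page at a time.

    fetch_slice is a hypothetical callable standing in for get_slice with
    SliceRange(start=start, finish="", count=count). An empty start means
    "from the beginning of the row".
    """
    if page_size < 2:
        # With an inclusive start, a page of 1 would never make progress.
        raise ValueError("page_size must be at least 2")
    start = ""
    first_page = True
    while True:
        page = fetch_slice(start, page_size)
        # Assumption: start is inclusive, so drop the repeated first column
        # on every page except the first.
        fresh = page if first_page else page[1:]
        yield from fresh
        if len(page) < page_size:
            return  # a short page means the end of the row was reached
        start = page[-1][0]
        first_page = False


def make_fake_row(n: int) -> Callable[[str, int], List[Column]]:
    """In-memory stand-in for a row with n columns, for trying the loop."""
    columns = [("col%07d" % i, "v%d" % i) for i in range(n)]

    def fetch_slice(start: str, count: int) -> List[Column]:
        return [c for c in columns if c[0] >= start][:count]

    return fetch_slice


names = [name for name, _ in iter_row(make_fake_row(2500), page_size=1000)]
```

Only one page is held in memory at a time, which is the point of reading this way; the sketch says nothing about the per-page latency discussed in the blog post.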
On May 8, 2012, at 9:54 AM, aaron morton wrote:

> If I was rebuilding my power after spending the first thousand years of the Third Age as a shapeless evil I would cast my Eye of Fire in the direction of the filthy little multi_gets.
>
> A node can fail to respond to a query with rpc_timeout for two reasons: either the command did not run or the command started but did not complete. The former is much more likely. If it is happening you will see large pending counts and dropped messages in nodetool tpstats; you will also see log entries about dropped messages.
>
> When you send a multi_get each row you request becomes a message in the read thread pool. If you request 100 rows you will put 100 messages in the pool, which by default has 32 threads. If some clients are sending large multi_gets (or batch mutations) you can overload nodes and starve other clients.
>
> For background, some metrics here for selecting from 10 million columns: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6/05/2012, at 7:14 AM, Luís Ferreira wrote:
>
>> Hi,
>>
>> I'm doing get_slice on huge rows (3 million columns) and even though I am doing it iteratively I keep getting TimeoutExceptions. I've tried to change the number of columns fetched but it did not work.
>>
>> I have a 5 machine cluster, each with 4GB of which 3 are dedicated to cassandra's heap, but still they all consume all of the memory and get huge IO wait due to the amount of reads.
>>
>> I am running tests with 100 clients all performing multiple operations, mostly get_slice, get and multi_get, but the timeouts only occur in the get_slice.
>>
>> Does this have anything to do with cassandra's ability (or lack thereof) to keep the rows in memory? Or am I doing anything wrong? Any tips?
>>
>> Compliments,
>> Luís Ferreira

Regards,
Luís Ferreira
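
On the quoted point that every row in a multi_get becomes its own message in the read thread pool, here is a rough Python sketch of splitting a large key list into smaller requests. It is an assumption-laden illustration: `fetch_rows(keys)` is a hypothetical stand-in for a multi_get call that returns a dict keyed by row key, and the batch size of 16 is an arbitrary value chosen only to stay below the 32 threads mentioned in the quote, not a recommendation.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunked(keys: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Split keys into consecutive batches of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for i in range(0, len(keys), size):
        yield keys[i:i + size]


def multiget_in_batches(fetch_rows: Callable[[Sequence[str]], Dict[str, str]],
                        keys: Sequence[str],
                        batch_size: int = 16) -> Dict[str, str]:
    """Fetch many rows through several small requests instead of one big one.

    fetch_rows is a hypothetical callable standing in for multi_get; each
    call asks for at most batch_size rows.
    """
    result: Dict[str, str] = {}
    for batch in chunked(keys, batch_size):
        result.update(fetch_rows(batch))
    return result


# In-memory stand-in that records how many rows each request asked for.
request_sizes: List[int] = []


def fake_fetch_rows(keys: Sequence[str]) -> Dict[str, str]:
    request_sizes.append(len(keys))
    return {k: "row-" + k for k in keys}


all_keys = ["key%03d" % i for i in range(100)]
rows = multiget_in_batches(fake_fetch_rows, all_keys, batch_size=16)
```

With 100 keys and a batch size of 16 this issues seven requests, none asking for more than 16 rows. Whether that reduces the timeouts depends on what the other clients are doing, which is what the tpstats check should show.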