From: Patrik Modesto
Date: Tue, 6 Mar 2012 09:32:59 +0100
Subject: Re: newer Cassandra + Hadoop = TimedOutException()
To: user@cassandra.apache.org

Hi,

I recently tried the Hadoop job with cassandra-all 0.8.10 again, and the
timeouts I get are not because Cassandra can't handle the requests.
I've noticed several tasks that show progress of several thousand percent.
It seems they are looping over their range of keys. I've run the job with
debug enabled and the ranges look OK, see http://pastebin.com/stVsFzLM

Another difference between cassandra-all 0.8.7 and 0.8.10 is the
number of mappers the job creates:
0.8.7: 4680
0.8.10: 595

Task                               Complete
task_201202281457_2027_m_000041    9076.81%
task_201202281457_2027_m_000073    9639.04%
task_201202281457_2027_m_000105    10538.60%
task_201202281457_2027_m_000108    9364.17%

None of this happens with cassandra-all 0.8.7.

Regards,
P.

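To put the percentages above in perspective, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions that the thread does not confirm: that a task's progress is computed as rows read divided by the rows expected for its split, and that the expected rows per split equal the cassandra.input.split.size of 16384 quoted further down the thread. The helper name rows_read_estimate is made up for this illustration.

```python
# Assumption: progress = rows_read / expected_rows_per_split.
# Assumption: expected rows per split = cassandra.input.split.size (16384),
# the value quoted in the thread; the job may have run with another value.
SPLIT_SIZE = 16384

# Task progress figures as reported in the message.
reported_progress = {
    "task_201202281457_2027_m_000041": 9076.81,
    "task_201202281457_2027_m_000073": 9639.04,
    "task_201202281457_2027_m_000105": 10538.60,
    "task_201202281457_2027_m_000108": 9364.17,
}


def rows_read_estimate(progress_percent: float, expected_rows: int) -> int:
    """Estimate rows read from a progress percentage (hypothetical helper)."""
    return round(progress_percent / 100.0 * expected_rows)


estimates = {
    task: rows_read_estimate(pct, SPLIT_SIZE)
    for task, pct in reported_progress.items()
}

for task, rows in estimates.items():
    print(f"{task}: roughly {rows} rows read, {SPLIT_SIZE} expected")
```

Under those assumptions each of the listed tasks would have read well over a million rows for a split sized at 16384, which fits the suspicion that the tasks are re-reading their key range rather than finishing it.
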
On Tue, Feb 28, 2012 at 12:29, Patrik Modesto wrote:
> I'll alter these settings and will let you know.
>
> Regards,
> P.
>
> On Tue, Feb 28, 2012 at 09:23, aaron morton wrote:
>> Have you tried lowering the batch size and increasing the time out? Even
>> just to get it to work.
>>
>> If you get a TimedOutException it means CL number of servers did not respond
>> in time.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
>>
>> Hi aaron,
>>
>> this is our current settings:
>>
>>      <property>
>>          <name>cassandra.range.batch.size</name>
>>          <value>1024</value>
>>      </property>
>>
>>      <property>
>>          <name>cassandra.input.split.size</name>
>>          <value>16384</value>
>>      </property>
>>
>> rpc_timeout_in_ms: 30000
>>
>> Regards,
>> P.
>>
>> On Mon, Feb 27, 2012 at 21:54, aaron morton wrote:
>>
>> What settings do you have for cassandra.range.batch.size
>> and rpc_timeout_in_ms? Have you tried reducing the first and/or increasing
>> the second?
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
>>
>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo wrote:
>>
>> Did you see the notes here?
>>
>> I'm not sure what do you mean by the notes?
>>
>> I'm using the mapred.* settings suggested there:
>>
>>     <property>
>>         <name>mapred.max.tracker.failures</name>
>>         <value>20</value>
>>     </property>
>>     <property>
>>         <name>mapred.map.max.attempts</name>
>>         <value>20</value>
>>     </property>
>>     <property>
>>         <name>mapred.reduce.max.attempts</name>
>>         <value>20</value>
>>     </property>
>>
>> But I still see the timeouts that I haven't with cassandra-all 0.8.7.
>>
>> P.
>>
>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
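
As an illustration of the suggestion quoted above (lower the batch size, just to get the job working), the following Python sketch renders the job properties named in the thread as Hadoop-style property XML with a reduced batch size. The trial value 256 and the helper render_properties are made up for this example; nobody in the thread proposed a specific number. rpc_timeout_in_ms is left out because it is set on the Cassandra nodes in cassandra.yaml, not in the job configuration.

```python
from xml.sax.saxutils import escape


def render_properties(settings: dict) -> str:
    """Render name/value pairs as a Hadoop-style <configuration> block.

    Hypothetical helper for illustration; not part of Hadoop or Cassandra.
    """
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(str(value))}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines)


# Settings as quoted in the thread.
current = {
    "cassandra.range.batch.size": 1024,
    "cassandra.input.split.size": 16384,
}

# Assumption: 256 is an arbitrary smaller batch size chosen for this sketch.
trial = {**current, "cassandra.range.batch.size": 256}

xml_text = render_properties(trial)
print(xml_text)
```

Changing one value at a time keeps it clear which setting, if any, makes the timeouts go away.
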