From: Jonathan Ellis
Date: Wed, 17 Aug 2011 10:08:46 -0500
Subject: Re: RF=1
To: user@cassandra.apache.org
References: <1312319458.4058.4.camel@us-wash-ch2ljq1.morningstar.com> <813E81B0-6A8C-4DAD-AB85-A5BA9C734F28@thelastpickle.com>

See https://issues.apache.org/jira/browse/CASSANDRA-2388

On Wed, Aug 17, 2011 at 6:28 AM, Patrik Modesto wrote:
> Hi,
>
> While investigating this issue, I found that hadoop+cassandra
> doesn't work if you stop even just one node in the cluster. It doesn't
> depend on RF. ColumnFamilyRecordReader gets a list of nodes (according
> to the RF) but chooses just the local host, and if there is no Cassandra
> running locally it throws a RuntimeException, which in turn marks
> the MapReduce task as failed.
>
> I've created a patch that makes ColumnFamilyRecordReader try the
> local node first and, if that fails, try the other nodes in its list. The
> patch is here: http://pastebin.com/0RdQ0HMx (I think attachments are
> not allowed on this ML.)
>
> Please test it and apply. It's for version 0.7.8.
>
> Regards,
> P.
>
>
> On Wed, Aug 3, 2011 at 13:59, aaron morton wrote:
>> If you want to take a look, o.a.c.hadoop.ColumnFamilyRecordReader.getSplits() is the function that gets the splits.
>>
>>
>> Cheers
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 3 Aug 2011, at 16:18, Patrik Modesto wrote:
>>
>>> On Tue, Aug 2, 2011 at 23:10, Jeremiah Jordan
>>> wrote:
>>>> If you have RF=1, taking one node down is going to cause 25% of your
>>>> data to be unavailable. If you want to tolerate a machine going down
>>>> you need to have at least RF=2; if you want to use quorum and have a
>>>> machine go down, you need at least RF=3.
>>>
>>> I know I can have RF > 1, but I have limited resources and I don't care
>>> about losing 25% of the data. RF > 1 basically means that if a node goes
>>> down I have the data elsewhere, but what I need is, if a node goes down,
>>> to just ignore its range. I can handle it in my applications using thrift,
>>> but hadoop-mapreduce can't handle it. It just fails with "Exception in
>>> thread "main" java.io.IOException: Could not get input splits". Is
>>> there a way to tell hadoop to ignore this range?
>>>
>>> Regards,
>>> P.
>>
>>
>

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
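
The fallback behaviour Patrik describes in the thread (try the local node first, then the other replicas in the list) can be sketched roughly as below. This is only an illustration of the idea, not the pastebin patch and not the change tracked in CASSANDRA-2388. The class ReplicaFailover, its methods localFirst and connectToFirstAvailable, and the example addresses are all hypothetical, and the connect function merely stands in for opening a Thrift connection to a node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of Cassandra: illustrates "local node first,
// then fall back to the remaining replicas" for a single input split.
public class ReplicaFailover {

    /** Returns the endpoints reordered so that localHost, if present, comes first. */
    public static List<String> localFirst(List<String> endpoints, String localHost) {
        List<String> ordered = new ArrayList<>();
        if (endpoints.contains(localHost)) {
            ordered.add(localHost);
        }
        for (String endpoint : endpoints) {
            if (!endpoint.equals(localHost)) {
                ordered.add(endpoint);
            }
        }
        return ordered;
    }

    /**
     * Tries each endpoint in order and returns the first successful result.
     * The connect function is an assumption standing in for a real connection attempt.
     * Fails only when every endpoint has failed.
     */
    public static <T> T connectToFirstAvailable(List<String> endpoints,
                                                Function<String, T> connect) {
        RuntimeException last = null;
        for (String endpoint : endpoints) {
            try {
                return connect.apply(endpoint);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next replica
            }
        }
        throw new IllegalStateException("no replica reachable among " + endpoints, last);
    }

    public static void main(String[] args) {
        // Made-up addresses; 10.0.0.1 plays the role of the local host that is down.
        List<String> replicas = Arrays.asList("10.0.0.2", "10.0.0.1", "10.0.0.3");
        List<String> ordered = localFirst(replicas, "10.0.0.1");
        String connected = connectToFirstAvailable(ordered, host -> {
            if (host.equals("10.0.0.1")) {
                throw new RuntimeException("connection refused: " + host);
            }
            return host;
        });
        System.out.println("connected to " + connected);
    }
}
```

Note that with RF=1 each range has only one replica, so the list holds a single endpoint and there is nothing to fall back to. Skipping the range of a dead node, which is what Patrik actually asks for, would need separate handling.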