From: Keith Stevens
To: user@cassandra.apache.org
Date: Mon, 16 Aug 2010 18:23:13 -0700
Subject: Re: The stability of Hadoop jobs outputting to Cassandra

Thanks for this update.

After another day at work, and more reading into Cassandra's underlying model, I think the problem I am encountering is due less to HBase and more to user error and a highly faulty cluster.

Cassandra's clean API and integration with Thrift were the two biggest factors that attracted me to it, in addition to personal recommendations from people at my university. Another attraction is that several people have remarked that it is much simpler to set up than HBase, which has been our main point of failure. I have read, though, that Cassandra is not as focused on large-scale analysis of documents via Hadoop as HBase is.

I'm going to try playing around with Cassandra over the next few days and see if it's more stable on our often-failing cluster when combined with Hadoop. I'll definitely try the simple solution of having a Thrift connection to Cassandra in the reducer.

Thanks!
--Keith

On Sun, Aug 15, 2010 at 6:05 PM, Jonathan Ellis wrote:
> Status: Fixed, Fix version: 0.7 beta 1 means it's in the beta1 that
> was just released, although
> https://issues.apache.org/jira/browse/CASSANDRA-1315 is open to change
> the API slightly. Either way, it won't be backported to 0.6.
>
> But you can write to Cassandra from the Hadoop job just fine w/o an
> OutputFormat. Just create a Thrift connection in your reduce job.
>
> On Sun, Aug 15, 2010 at 6:24 PM, Keith Stevens wrote:
> > Hello,
> > I'm currently working on a project that is using HBase and Hadoop, but I'm
> > currently looking into alternatives to HBase. Cassandra seems to be the
> > next best replacement, or perhaps a better replacement, except that the
> > stable release is lacking support for Hadoop jobs writing to Cassandra.
> > I found CASSANDRA-1101 and wanted to know how stable that update is. Will
> > it be made part of a release any time soon? Has anyone been using the
> > update regularly?
> > Thanks!
> > --Keith
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
