From: Markus Mock
To: user@cassandra.apache.org
Date: Fri, 15 Jul 2011 10:34:36 +0200
Subject: Multiple input column families in Cassandra Hadoop mapreduce

Hello,

with org.apache.cassandra.hadoop.ConfigHelper.setInputColumnFamily I can set up the map phase to read from one column family. Is it possible to have multiple mapper classes each mapping over their own column family so that data from multiple column families can be "joined" in the reduce phase? I didn't find any documentation on how to do that.
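
The reduce-side join asked about here can be sketched roughly as follows. This is only an illustration in plain Java, with no Hadoop or Cassandra classes involved: the column family names `Users` and `Orders`, the `Tagged` class, and the helpers `mapColumnFamily` and `reduceJoin` are all hypothetical. The idea is that each mapper tags its output with the column family it read from, so that the reducer, which sees all values for one row key, can pair values from the two sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSideJoinSketch {

    /** A mapper output value tagged with the (hypothetical) column family it came from. */
    static final class Tagged {
        final String rowKey;
        final String columnFamily;
        final String value;

        Tagged(String rowKey, String columnFamily, String value) {
            this.rowKey = rowKey;
            this.columnFamily = columnFamily;
            this.value = value;
        }
    }

    /** Stand-in for one mapper: tag every row of a single column family with its source. */
    static List<Tagged> mapColumnFamily(String columnFamily, Map<String, String> rows) {
        List<Tagged> out = new ArrayList<>();
        for (Map.Entry<String, String> e : rows.entrySet()) {
            out.add(new Tagged(e.getKey(), columnFamily, e.getValue()));
        }
        return out;
    }

    /** Stand-in for shuffle plus reduce: group by row key, then pair left and right values. */
    static Map<String, List<String>> reduceJoin(List<Tagged> tagged, String leftCf, String rightCf) {
        Map<String, List<Tagged>> grouped = new TreeMap<>();
        for (Tagged t : tagged) {
            grouped.computeIfAbsent(t.rowKey, k -> new ArrayList<>()).add(t);
        }
        Map<String, List<String>> joined = new TreeMap<>();
        for (Map.Entry<String, List<Tagged>> e : grouped.entrySet()) {
            List<String> pairs = new ArrayList<>();
            for (Tagged left : e.getValue()) {
                if (!left.columnFamily.equals(leftCf)) {
                    continue;
                }
                for (Tagged right : e.getValue()) {
                    if (right.columnFamily.equals(rightCf)) {
                        pairs.add(left.value + "|" + right.value);
                    }
                }
            }
            // Inner join: keys present in only one column family are dropped.
            if (!pairs.isEmpty()) {
                joined.put(e.getKey(), pairs);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, String> users = new TreeMap<>();
        users.put("k1", "alice");
        users.put("k2", "bob");
        Map<String, String> orders = new TreeMap<>();
        orders.put("k1", "book");
        orders.put("k3", "pen");

        // Outputs of both "mappers" end up in the same shuffle.
        List<Tagged> shuffled = new ArrayList<>();
        shuffled.addAll(mapColumnFamily("Users", users));
        shuffled.addAll(mapColumnFamily("Orders", orders));

        System.out.println(reduceJoin(shuffled, "Users", "Orders"));
    }
}
```

Whether a single Cassandra Hadoop job can feed two column families into one shuffle like this is exactly the open question; the sketch only shows the join logic the reducer would need.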

One workaround I see is to run several MR jobs that write the data from the different column families into a single helper column family and then do the desired computation on that, but I am trying to avoid that if possible. Any suggestions on how to do this without running multiple MRs, and instead read from multiple column families in one go?

Thanks.

-- Markus
