From: Alex Burkoff
To: "user@cassandra.apache.org"
Date: Thu, 30 Sep 2010 13:08:37 -0700
Subject: RE: ColumnFamilyOutputFormat and mapreduce.output.columnfamilyoutputformat.batch.threshold 0.70beta1

Okay, I've tested again with the latest build. I am using cleanup() reducer's method to write 10 rows with 2 columns each.

With default value of mapreduce.output.columnfamilyoutputformat.batch.threshold, 10 rows are written with second (!!!) column only.
When mapreduce.output.columnfamilyoutputformat.batch.threshold is set to 1, it writes 10 rows with the first (!!!!) column only.

Any ideas ? :)
________________________________________
From: Jeremy Hanna [jeremy.hanna1234@gmail.com]
Sent: Thursday, September 09, 2010 8:46 AM
To: user@cassandra.apache.org
Subject: Re: ColumnFamilyOutputFormat and mapreduce.output.columnfamilyoutputformat.batch.threshold 0.70beta1

When Jonathan said don't build trunk thrift, he meant just thrift - apply the patches against cassandra trunk. You shouldn't need to build the thrift bindings.

On Sep 8, 2010, at 10:54 PM, Alex Burkoff wrote:

> Well, 7.0beta1 rejects those patches. Is there a specific revision I can try
> applying them to ?
>
> Alex.
> ________________________________________
> From: Jonathan Ellis [jbellis@gmail.com]
> Sent: Wednesday, September 08, 2010 6:48 PM
> To: user@cassandra.apache.org
> Subject: Re: ColumnFamilyOutputFormat and mapreduce.output.columnfamilyoutputformat.batch.threshold 0.70beta1
>
> You can't build Cassandra against trunk thrift, the API has changed.
> Stick to the one shipped w/ Cassandra and you will be fine.
>
> On Wed, Sep 8, 2010 at 5:41 PM, Alex Burkoff wrote:
>> With the trunk version and given patches I am now getting following exception:
>>
>> 10/09/08 22:39:14 WARN mapred.LocalJobRunner: job_local_0001
>> java.lang.ClassCastException: [B cannot be cast to java.nio.ByteBuffer
>>         at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:68)
>>         at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
>>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>         at cassandratest.Main$TReducer.reduce(Main.java:132)
>>         at cassandratest.Main$TReducer.reduce(Main.java:113)
>>         at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
>> 10/09/08 22:39:14 INFO mapred.JobClient: map 100% reduce 0%
>>
>> Alex.
>> ________________________________________
>> From: Jonathan Ellis [jbellis@gmail.com]
>> Sent: Wednesday, September 08, 2010 2:26 PM
>> To: user@cassandra.apache.org
>> Subject: Re: ColumnFamilyOutputFormat and mapreduce.output.columnfamilyoutputformat.batch.threshold 0.70beta1
>>
>> Try the patches on
>> https://issues.apache.org/jira/browse/CASSANDRA-1434 (or wait until
>> they're committed to trunk, then try a nightly build)
>>
>> On Wed, Sep 8, 2010 at 4:18 PM, Alex Burkoff wrote:
>>> Guys,
>>>
>>> I was testing ColumnFamilyOutputFormat and found that only columns from the last Reduce
>>> invocation get stored when mapreduce.output.columnfamilyoutputformat.batch.threshold has
>>> the default value. Setting it to 1 changes the behavior, and all data is stored then. Is it the
>>> intended behavior, or am I missing something ?
>>>
>>> Best regards,
>>>
>>> Alex Burkoff
>>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
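
For reference, the "[B cannot be cast to java.nio.ByteBuffer" message in the stack trace quoted above is what the JVM reports when a raw byte[] reaches code that casts its argument to ByteBuffer. The sketch below reproduces that failure in isolation and shows that wrapping the array with ByteBuffer.wrap avoids it. It uses only the JDK; the class ByteBufferCastSketch and the methods describeKey and failsWithRawBytes are hypothetical stand-ins and are not part of Cassandra or Hadoop. It illustrates the exception only and makes no claim about what ColumnFamilyRecordWriter does internally.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferCastSketch {

    // Hypothetical stand-in for a writer that assumes its key is a ByteBuffer.
    // The cast fails at runtime if the caller passes a raw byte[] instead.
    static String describeKey(Object key) {
        ByteBuffer buf = (ByteBuffer) key; // ClassCastException for byte[]
        // duplicate() so decoding does not move the caller's buffer position
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    // Returns true when passing the raw array triggers the ClassCastException.
    static boolean failsWithRawBytes(byte[] raw) {
        try {
            describeKey(raw);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] raw = "row1".getBytes(StandardCharsets.UTF_8);

        // Passing byte[] directly reproduces the kind of failure in the trace.
        System.out.println("raw byte[] fails: " + failsWithRawBytes(raw));

        // Wrapping the array gives the callee the type it expects.
        System.out.println("wrapped key: " + describeKey(ByteBuffer.wrap(raw)));
    }
}
```

If a job hits this exception, one thing to check is whether the key and column values it emits match the types the output format in that build expects, since the trace suggests the types changed between builds.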