From: Eric Yang
To: "chukwa-dev@incubator.apache.org"
Date: Wed, 5 Jan 2011 13:33:28 -0800
Subject: Re: questions about pig

I am also getting the same problem on my cluster. I will have a patch for
the empty row key problem soon.

Regards,
Eric

On 1/4/11 8:42 PM, "Eric Yang" wrote:

It looks like HBaseStorage intentionally treats an empty row key as invalid.
Nothing can be done on the script side to skip this. You should be able to
fetch the empty row with:

get 'SystemMetrics', ''

in the hbase shell. If something is returned, then you need to delete this row:

deleteall 'SystemMetrics', ''

The question is, how did you end up with an empty row key? I can't figure out
how this is possible if the metrics are streamed using the SystemMetrics
Adaptor. Any idea?

regards,
Eric

On Tue, Jan 4, 2011 at 6:05 PM, Ariel Rabkin wrote:
> Hm.
>
> The table is biggish; awkward to scan by hand. Can we modify the script to
> ignore empty rows?
>
> --Ari
>
> On Tue, Jan 4, 2011 at 8:35 PM, Eric Yang wrote:
>> It looks like the row key is empty after parsing. What does the row key
>> look like in the SystemMetrics table?
>> The expected format is:
>>
>> 1234567890000-hostname
>>
>> Make sure there is no empty row key in the SystemMetrics table.
>>
>> Regards,
>> Eric
>>
>> On 1/4/11 5:09 PM, "Ariel Rabkin" wrote:
>>
>> So I have pig+hbase running. Thanks so much!
>>
>> But now I get the following error, from the System Metrics aggregation:
>>
>> java.io.IOException: java.lang.IllegalArgumentException: Row key is invalid
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:438)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
>>         at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
>> Caused by: java.lang.IllegalArgumentException: Row key is invalid
>>         at org.apache.hadoop.hbase.client.Put.<init>(Put.java:79)
>>         at org.apache.hadoop.hbase.client.Put.<init>(Put.java:69)
>>         at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:355)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
>>         at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
>>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:436)
>>         ... 7 more
>>
>>
>> Thoughts?
>>
>> --
>> Ari Rabkin asrabkin@gmail.com
>> UC Berkeley Computer Science Department
>
> --
> Ari Rabkin asrabkin@gmail.com
> UC Berkeley Computer Science Department
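
For illustration, the row key check discussed in this thread could be sketched as follows. This is only a rough sketch in Python, assuming keys follow the `1234567890000-hostname` shape quoted above (a run of digits, a hyphen, then a non-empty hostname). The names `ROW_KEY_PATTERN`, `is_valid_row_key`, and `split_row_keys` are hypothetical and are not part of Chukwa, Pig, or HBase; the exact rules the SystemMetrics table expects may be stricter or looser than this pattern.

```python
import re

# Assumed pattern, inferred from the example key "1234567890000-hostname":
# a numeric timestamp, a hyphen, then a non-empty hostname without whitespace.
ROW_KEY_PATTERN = re.compile(r"^\d+-\S+$")


def is_valid_row_key(key):
    """Return True if key is non-empty and matches the assumed format."""
    return bool(key) and ROW_KEY_PATTERN.match(key) is not None


def split_row_keys(keys):
    """Partition keys into (valid, invalid) lists, preserving input order."""
    valid, invalid = [], []
    for key in keys:
        (valid if is_valid_row_key(key) else invalid).append(key)
    return valid, invalid
```

A check like this could be run over keys exported from the table to find the offending rows before deleting them by hand; it does not change how HBaseStorage itself rejects empty keys.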