Return-Path: Delivered-To: apmail-lucene-hadoop-user-archive@locus.apache.org Received: (qmail 48934 invoked from network); 9 Nov 2007 17:23:37 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 9 Nov 2007 17:23:37 -0000 Received: (qmail 85442 invoked by uid 500); 9 Nov 2007 17:23:19 -0000 Delivered-To: apmail-lucene-hadoop-user-archive@lucene.apache.org Received: (qmail 85406 invoked by uid 500); 9 Nov 2007 17:23:19 -0000 Mailing-List: contact hadoop-user-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hadoop-user@lucene.apache.org Delivered-To: mailing list hadoop-user@lucene.apache.org Received: (qmail 85397 invoked by uid 99); 9 Nov 2007 17:23:19 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 09 Nov 2007 09:23:19 -0800 X-ASF-Spam-Status: No, hits=1.5 required=10.0 tests=SPF_PASS,WEIRD_PORT X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: local policy) Received: from [63.203.238.117] (HELO dns.duboce.net) (63.203.238.117) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 09 Nov 2007 17:24:00 +0000 Received: by dns.duboce.net (Postfix, from userid 1008) id 56ABAC518; Fri, 9 Nov 2007 07:53:44 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.1.4 (2006-07-26) on dns.duboce.net X-Spam-Level: Received: from durruti.local (durruti.desk.hq.powerset.com [208.84.6.146]) by dns.duboce.net (Postfix) with ESMTP id 78B96C256 for ; Fri, 9 Nov 2007 07:53:39 -0800 (PST) Message-ID: <47349718.30407@duboce.net> Date: Fri, 09 Nov 2007 09:21:28 -0800 From: Michael Stack User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: hadoop-user@lucene.apache.org Subject: Re: Excpetion when combining Hadoop MapReduce with HBase References: <7c962aed0711082056o7178863qdec2f649042e84c@mail.gmail.com> <47347D5F.5010007@gmail.com> In-Reply-To: <47347D5F.5010007@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org X-Old-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00, FORGED_RCVD_HELO autolearn=ham version=3.1.4 Lets use HADOOP-2179 for figuring out whats going on Holger. Would you mind uploading your configuration files -- both hadoop and hbase -- and put a link there to the data you are using so I can duplicate your setup locally? Thanks, St.Ack Holger Stenzhorn wrote: > Hi, > > During my latest "experiments" I was always using the local mode for > Hadoop/Hbase. > In the following I just give you the respective Reducer classes that I > use for testing. > Employing the... > - "TestFileReducer" the MapReduce job completes flawlessly. > - "TestBaseReducer" the job crashes (see log below and attached with > DEBUG turned on) > > Cheers, > Holger > > Code: > ------ > > public static class TestFileReducer extends MapReduceBase > implements Reducer { > public void reduce(Text key, Iterator values, > OutputCollector output, > Reporter reporter) throws IOException { > StringBuilder builder = new StringBuilder(); > while (values.hasNext()) { > builder.append(values.next() + "\n"); > } > output.collect(key, new Text(builder.toString())); > } > } > > public static class TestBaseReducer extends MapReduceBase > implements Reducer { > public void reduce(Text key, Iterator values, > OutputCollector output, > Reporter reporter) throws IOException { > StringBuilder builder = new StringBuilder(); > while (values.hasNext()) { > builder.append(values.next() + "\n"); > } > MapWritable value = new MapWritable(); > value.put(new Text("triples:" + string.hashCode()), new > ImmutableBytesWritable(builder.toString().getBytes())); > output.collect(key, value); > } > } > > > > Test case log: > -------------- > > 07/11/09 15:17:22 INFO mapred.JobClient: map 100% reduce 83% > 07/11/09 15:17:25 INFO mapred.LocalJobRunner: reduce > reduce > 07/11/09 15:17:28 INFO mapred.LocalJobRunner: reduce > reduce > 07/11/09 15:17:31 INFO mapred.LocalJobRunner: reduce > reduce > 07/11/09 15:17:34 INFO mapred.LocalJobRunner: reduce > reduce > 07/11/09 15:17:37 INFO mapred.LocalJobRunner: reduce > reduce > 07/11/09 15:22:24 WARN mapred.LocalJobRunner: job_local_1 > java.net.SocketTimeoutException: timed out waiting for rpc response > at org.apache.hadoop.ipc.Client.call(Client.java:484) > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184) > at $Proxy1.batchUpdate(Unknown Source) > at org.apache.hadoop.hbase.HTable.commit(HTable.java:724) > at org.apache.hadoop.hbase.HTable.commit(HTable.java:701) > at > org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:89) > > at > org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:63) > > at > org.apache.hadoop.mapred.ReduceTask$2.collect(ReduceTask.java:308) > at TriplesTest$TestBaseReducer.reduce(TriplesTest.java:78) > at TriplesTest$TestBaseReducer.reduce(TriplesTest.java:52) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:326) > at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:164) > Exception in thread "main" java.io.IOException: Job failed! > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:831) > at TriplesTest.run(TriplesTest.java:181) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) > at TriplesTest.main(TriplesTest.java:187) > > > Master log: > ----------- > > Attached as GZIP... > > Cheers, > Holger > > stack wrote: >> Holger Stenzhorn wrote: >> >>> Since I am just a beginner at Hadoop and Hbase I cannot really tell >>> whether an exception is a "real" one or just a "hint" - but >>> exceptions always look scary... :-) >>> >> Yeah. You might add your POV to the issue. >> >> >>> Anyways, when I did cut the allowed heap for the server and the test >>> class both down to 2GB. In this case I just get following (not too >>> optimistic) results... >>> Well, I post all the log I think that might be necessary since I >>> cannot say which exception is important and which one not. >>> >> Looks like your poor old mapreduce job failed when it tried to write a >> record to hbase. >> >> Looking at the master log, whats odd is that the catalog .META. table >> has no mention of the 'triples' table. Its been created? (It may not >> be showing because you are not running with logging at DEBUG level. >> As of yesterday or so you can set DEBUG level from the UI by browsing >> to 'Log Level' servlet at http://MASTER_HOST:PORT -- usually >> http://localhost:60000/ -- and set the package >> 'org.apache.hadoop.hbase' to DEBUG). >> >> But then you OOME. >> >> You might try outputting back to local filesystem rather than to HDFS >> -- use something like TextOutputFormat instead of TableOutputFormat. >> If that works, then there is an issue w/ writing output to hbase. >> Please open a JIRA, paste your MR program and lets figure a way to get >> the data file across. >> >> Thanks for your patience H, >> St.Ack >> >