hadoop-common-user mailing list archives

From Michael Stack <st...@duboce.net>
Subject Re: Exception when combining Hadoop MapReduce with HBase
Date Fri, 09 Nov 2007 17:21:28 GMT
Let's use HADOOP-2179 for figuring out what's going on, Holger.  Would 
you mind uploading your configuration files -- both hadoop and hbase -- 
and putting a link there to the data you are using so I can duplicate 
your setup locally?

Thanks,
St.Ack


Holger Stenzhorn wrote:
> Hi,
>
> During my latest "experiments" I was always using the local mode for 
> Hadoop/HBase.
> In the following I give you just the respective Reducer classes that I 
> use for testing.
> Employing the...
> - "TestFileReducer", the MapReduce job completes flawlessly.
> - "TestBaseReducer", the job crashes (see the log below and attached 
> with DEBUG turned on).
>
> Cheers,
> Holger
>
> Code:
> ------
>
>  public static class TestFileReducer extends MapReduceBase
>      implements Reducer<Text, Text, Text, Text> {
>    public void reduce(Text key, Iterator<Text> values,
>                       OutputCollector<Text, Text> output,
>                       Reporter reporter) throws IOException {
>      StringBuilder builder = new StringBuilder();
>      while (values.hasNext()) {
>        builder.append(values.next() + "\n");
>      }
>      output.collect(key, new Text(builder.toString()));
>    }
>  }
>
>  public static class TestBaseReducer extends MapReduceBase
>      implements Reducer<Text, Text, Text, MapWritable> {
>    public void reduce(Text key, Iterator<Text> values,
>                       OutputCollector<Text, MapWritable> output,
>                       Reporter reporter) throws IOException {
>      StringBuilder builder = new StringBuilder();
>      while (values.hasNext()) {
>        builder.append(values.next() + "\n");
>      }
>      MapWritable value = new MapWritable();
>      // Column key "triples:" plus a hash suffix; key.hashCode() is used
>      // here since the original snippet referenced an undefined "string".
>      value.put(new Text("triples:" + key.hashCode()),
>                new ImmutableBytesWritable(builder.toString().getBytes()));
>      output.collect(key, value);
>    }
>  }
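>
>  (The driver itself is not shown here -- roughly, it is wired up like
>  this.  A sketch with assumed details, though the table name "triples",
>  ToolRunner, and JobClient.runJob do appear in the logs below:)
>
>    JobConf conf = new JobConf(getConf(), TriplesTest.class);
>    // Send reduce output to HBase via TableOutputFormat;
>    // TableOutputFormat.OUTPUT_TABLE names the target table.
>    conf.setOutputFormat(TableOutputFormat.class);
>    conf.set(TableOutputFormat.OUTPUT_TABLE, "triples");
>    conf.setReducerClass(TestBaseReducer.class);
>    conf.setOutputKeyClass(Text.class);
>    conf.setOutputValueClass(MapWritable.class);
>    JobClient.runJob(conf);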
>
>
>
> Test case log:
> --------------
>
> 07/11/09 15:17:22 INFO mapred.JobClient:  map 100% reduce 83%
> 07/11/09 15:17:25 INFO mapred.LocalJobRunner: reduce > reduce
> 07/11/09 15:17:28 INFO mapred.LocalJobRunner: reduce > reduce
> 07/11/09 15:17:31 INFO mapred.LocalJobRunner: reduce > reduce
> 07/11/09 15:17:34 INFO mapred.LocalJobRunner: reduce > reduce
> 07/11/09 15:17:37 INFO mapred.LocalJobRunner: reduce > reduce
> 07/11/09 15:22:24 WARN mapred.LocalJobRunner: job_local_1
> java.net.SocketTimeoutException: timed out waiting for rpc response
>        at org.apache.hadoop.ipc.Client.call(Client.java:484)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>        at $Proxy1.batchUpdate(Unknown Source)
>        at org.apache.hadoop.hbase.HTable.commit(HTable.java:724)
>        at org.apache.hadoop.hbase.HTable.commit(HTable.java:701)
>        at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:89)
>        at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:63)
>        at org.apache.hadoop.mapred.ReduceTask$2.collect(ReduceTask.java:308)
>        at TriplesTest$TestBaseReducer.reduce(TriplesTest.java:78)
>        at TriplesTest$TestBaseReducer.reduce(TriplesTest.java:52)
>        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:326)
>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:164)
> Exception in thread "main" java.io.IOException: Job failed!
>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:831)
>        at TriplesTest.run(TriplesTest.java:181)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at TriplesTest.main(TriplesTest.java:187)
>
>
> Master log:
> -----------
>
> Attached as GZIP...
>
> Cheers,
> Holger
>
> stack wrote:
>> Holger Stenzhorn wrote:
>>  
>>> Since I am just a beginner at Hadoop and HBase, I cannot really tell 
>>> whether an exception is a "real" one or just a "hint" -- but 
>>> exceptions always look scary... :-)
>>>     
>> Yeah.  You might add your POV to the issue.
>>
>>  
>>> Anyway, I cut the allowed heap for both the server and the test 
>>> class down to 2GB.  In this case I just get the following (not too 
>>> optimistic) results...
>>> Well, I post all the logs that I think might be necessary since I 
>>> cannot say which exceptions are important and which are not.
>>>     
>> Looks like your poor old MapReduce job failed when it tried to write 
>> a record to HBase.
>>
>> Looking at the master log, what's odd is that the catalog .META. table
>> has no mention of the 'triples' table.  Has it been created?  (It may
>> not be showing because you are not running with logging at DEBUG level.
>> As of yesterday or so you can set the DEBUG level from the UI by
>> browsing to the 'Log Level' servlet at http://MASTER_HOST:PORT --
>> usually http://localhost:60000/ -- and setting the package
>> 'org.apache.hadoop.hbase' to DEBUG).
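>>
>> The same can be set statically in the log4j.properties under conf/ (a
>> sketch using the standard log4j logger convention; it only takes
>> effect after a restart):
>>
>>   log4j.logger.org.apache.hadoop.hbase=DEBUG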
>>
>> But then you hit an OOME (OutOfMemoryError).
>>
>> You might try outputting back to the local filesystem rather than to
>> HBase -- use something like TextOutputFormat instead of
>> TableOutputFormat.  If that works, then there is an issue with writing
>> output to HBase.  Please open a JIRA, paste your MR program, and let's
>> figure out a way to get the data file across.
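>>
>> A minimal sketch of that swap in the job driver (the output path here
>> is hypothetical; setOutputPath is the old JobConf API):
>>
>>   conf.setOutputFormat(TextOutputFormat.class);
>>   conf.setOutputPath(new Path("/tmp/triples-out"));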
>>
>> Thanks for your patience H,
>> St.Ack
>>   
>

