hadoop-mapreduce-user mailing list archives

From Ondrej Holecek <ond...@holecek.eu>
Subject Re: mapreduce streaming with hbase as a source
Date Sat, 19 Feb 2011 13:03:14 GMT
Thank you, I've spent a lot of time debugging but didn't notice this typo :(

Now it works, but I don't understand one thing: on stdin I get this:

72 6f 77 31     keyvalues={row1/family1:a/1298037737154/Put/vlen=1,
row1/family1:b/1298037744658/Put/vlen=1, row1/family1:c/1298037748020/Put/vlen=1}
72 6f 77 32     keyvalues={row2/family1:a/1298037755440/Put/vlen=2,
row2/family1:b/1298037758241/Put/vlen=2, row2/family1:c/1298037761198/Put/vlen=2}
72 6f 77 33     keyvalues={row3/family1:a/1298037767127/Put/vlen=3,
row3/family1:b/1298037770111/Put/vlen=3, row3/family1:c/1298037774954/Put/vlen=3}

I see everything there except the value. What should I do to get the value on stdin too?
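(For what it's worth, the thread never shows the test-map script itself, so here is a minimal sketch of a streaming mapper in Python — my assumption, not the poster's actual script — that parses the stdin lines above, assuming Hadoop Streaming's usual tab-separated key/value framing and that the key is the row key printed as space-separated hex bytes:)

```python
import sys

def decode_hex_key(hex_key):
    # Decode a row key printed as space-separated hex bytes,
    # e.g. '72 6f 77 31' -> 'row1'.
    return bytes.fromhex(hex_key.replace(" ", "")).decode("utf-8")

def map_line(line):
    # Hadoop Streaming hands the mapper one record per line,
    # with key and value separated by a tab.
    key, _, value = line.rstrip("\n").partition("\t")
    return decode_hex_key(key), value

def run():
    # Standard streaming-mapper loop: read records from stdin,
    # emit key<TAB>value pairs on stdout.
    for line in sys.stdin:
        if line.strip():
            row, value = map_line(line)
            print(f"{row}\t{value}")

if __name__ == "__main__":
    run()
```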

Ondrej

On 02/18/11 20:01, Jean-Daniel Cryans wrote:
> You have a typo, it's hbase.mapred.tablecolumns not hbase.mapred.tablecolumn
> 
> J-D
> 
> On Fri, Feb 18, 2011 at 6:05 AM, Ondrej Holecek <ondrej@holecek.eu> wrote:
>> Hello,
>>
>> I'm testing hadoop and hbase. I can run mapreduce streaming or pipes jobs against
>> text files on hadoop, but I have a problem when I try to run the same job against
>> an hbase table.
>>
>> The table looks like this:
>> hbase(main):015:0> scan 'table1'
>> ROW                                                COLUMN+CELL
>>
>>  row1                                              column=family1:a, timestamp=1298037737154,
>> value=1
>>
>>  row1                                              column=family1:b, timestamp=1298037744658,
>> value=2
>>
>>  row1                                              column=family1:c, timestamp=1298037748020,
>> value=3
>>
>>  row2                                              column=family1:a, timestamp=1298037755440,
>> value=11
>>
>>  row2                                              column=family1:b, timestamp=1298037758241,
>> value=22
>>
>>  row2                                              column=family1:c, timestamp=1298037761198,
>> value=33
>>
>>  row3                                              column=family1:a, timestamp=1298037767127,
>> value=111
>>
>>  row3                                              column=family1:b, timestamp=1298037770111,
>> value=222
>>
>>  row3                                              column=family1:c, timestamp=1298037774954,
>> value=333
>>
>> 3 row(s) in 0.0240 seconds
>>
>>
>> The command I use, and the exception I get:
>>
>> # hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar -D
>> hbase.mapred.tablecolumn=family1:  -input table1 -output /mtestout45 -mapper test-map
>> -numReduceTasks 1 -reducer test-reduce -inputformat org.apache.hadoop.hbase.mapred.TableInputFormat
>>
>> packageJobJar: [/var/lib/hadoop/cache/root/hadoop-unjar8960137205806573426/] []
>> /tmp/streamjob8218197708173702571.jar tmpDir=null
>> 11/02/18 14:45:48 INFO mapred.JobClient: Cleaning up the staging area
>> hdfs://oho-nnm.dev.chservices.cz/var/lib/hadoop/cache/mapred/mapred/staging/root/.staging/job_201102151449_0035
>> Exception in thread "main" java.lang.RuntimeException: Error in configuring object
>>        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>        at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:597)
>>        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:926)
>>        at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:918)
>>        at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>>        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:834)
>>        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:793)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at javax.security.auth.Subject.doAs(Subject.java:396)
>>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>>        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:793)
>>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:767)
>>        at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:922)
>>        at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:123)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>        at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>> Caused by: java.lang.reflect.InvocationTargetException
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>        ... 23 more
>> Caused by: java.lang.NullPointerException
>>        at org.apache.hadoop.hbase.mapred.TableInputFormat.configure(TableInputFormat.java:51)
>>        ... 28 more
>>
>>
>> Can anyone tell me what I am doing wrong?
>>
>> Regards,
>> Ondrej
>>
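(Editorially, for the record: with the property name J-D points out — hbase.mapred.tablecolumns, plural — the quoted invocation above would presumably look like this; same jar, paths, and scripts as in the original command:)

```shell
# Same command as above, with only the property name corrected
# (hbase.mapred.tablecolumns, not hbase.mapred.tablecolumn):
hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar \
  -D hbase.mapred.tablecolumns=family1: \
  -input table1 -output /mtestout45 \
  -mapper test-map -reducer test-reduce -numReduceTasks 1 \
  -inputformat org.apache.hadoop.hbase.mapred.TableInputFormat
```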

