hadoop-mapreduce-user mailing list archives

From ShengChang Gu <gushengch...@gmail.com>
Subject Re: mapreduce streaming with hbase as a source
Date Sat, 19 Feb 2011 15:41:33 GMT
By default, the prefix of a line up to the first tab character is the
key, and the rest of the line (excluding the tab character) is the
value. If there is no tab character in the line, the entire line is
considered the key and the value is null. However, this can be
customized. Use:

-D stream.map.output.field.separator=.
-D stream.num.map.output.key.fields=4
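
For example (a sketch only -- the jar path, table, output dir, and script
names below are reused from Ondrej's command later in this thread; the
point is just where the -D options go, before the streaming options):

  hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar \
      -D stream.map.output.field.separator=. \
      -D stream.num.map.output.key.fields=4 \
      -input table1 -output /mtestout45 \
      -mapper test-map -reducer test-reduce

With these two options, the prefix of each map output line up to the
fourth "." is the key, and the rest of the line is the value.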

2011/2/19 Ondrej Holecek <ondrej@holecek.eu>

> Thank you, I've spent a lot of time debugging but didn't notice this
> typo :(
>
> Now it works, but I don't understand one thing. On stdin I get this:
>
> 72 6f 77 31     keyvalues={row1/family1:a/1298037737154/Put/vlen=1, row1/family1:b/1298037744658/Put/vlen=1, row1/family1:c/1298037748020/Put/vlen=1}
> 72 6f 77 32     keyvalues={row2/family1:a/1298037755440/Put/vlen=2, row2/family1:b/1298037758241/Put/vlen=2, row2/family1:c/1298037761198/Put/vlen=2}
> 72 6f 77 33     keyvalues={row3/family1:a/1298037767127/Put/vlen=3, row3/family1:b/1298037770111/Put/vlen=3, row3/family1:c/1298037774954/Put/vlen=3}
>
> I see everything there except the values. What should I do to get the
> values on stdin too?
>
> Ondrej
>
> On 02/18/11 20:01, Jean-Daniel Cryans wrote:
> > You have a typo: it's hbase.mapred.tablecolumns, not
> > hbase.mapred.tablecolumn.
> >
> > J-D
> >
> > On Fri, Feb 18, 2011 at 6:05 AM, Ondrej Holecek <ondrej@holecek.eu> wrote:
> >> Hello,
> >>
> >> I'm testing hadoop and hbase. I can run mapreduce streaming or pipes
> >> jobs against text files on hadoop, but I have a problem when I try to
> >> run the same job against an hbase table.
> >>
> >> The table looks like this:
> >> hbase(main):015:0> scan 'table1'
> >> ROW       COLUMN+CELL
> >>  row1     column=family1:a, timestamp=1298037737154, value=1
> >>  row1     column=family1:b, timestamp=1298037744658, value=2
> >>  row1     column=family1:c, timestamp=1298037748020, value=3
> >>  row2     column=family1:a, timestamp=1298037755440, value=11
> >>  row2     column=family1:b, timestamp=1298037758241, value=22
> >>  row2     column=family1:c, timestamp=1298037761198, value=33
> >>  row3     column=family1:a, timestamp=1298037767127, value=111
> >>  row3     column=family1:b, timestamp=1298037770111, value=222
> >>  row3     column=family1:c, timestamp=1298037774954, value=333
> >> 3 row(s) in 0.0240 seconds
> >>
> >>
> >> And the command I use, with the exception I get:
> >>
> >> # hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar \
> >>     -D hbase.mapred.tablecolumn=family1: \
> >>     -input table1 -output /mtestout45 \
> >>     -mapper test-map -numReduceTasks 1 -reducer test-reduce \
> >>     -inputformat org.apache.hadoop.hbase.mapred.TableInputFormat
> >>
> >> packageJobJar: [/var/lib/hadoop/cache/root/hadoop-unjar8960137205806573426/] [] /tmp/streamjob8218197708173702571.jar tmpDir=null
> >> 11/02/18 14:45:48 INFO mapred.JobClient: Cleaning up the staging area hdfs://oho-nnm.dev.chservices.cz/var/lib/hadoop/cache/mapred/mapred/staging/root/.staging/job_201102151449_0035
> >> Exception in thread "main" java.lang.RuntimeException: Error in configuring object
> >>        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> >>        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> >>        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >>        at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:597)
> >>        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:926)
> >>        at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:918)
> >>        at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
> >>        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:834)
> >>        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:793)
> >>        at java.security.AccessController.doPrivileged(Native Method)
> >>        at javax.security.auth.Subject.doAs(Subject.java:396)
> >>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
> >>        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:793)
> >>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:767)
> >>        at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:922)
> >>        at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:123)
> >>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>        at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
> >>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>        at java.lang.reflect.Method.invoke(Method.java:597)
> >>        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> >> Caused by: java.lang.reflect.InvocationTargetException
> >>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>        at java.lang.reflect.Method.invoke(Method.java:597)
> >>        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> >>        ... 23 more
> >> Caused by: java.lang.NullPointerException
> >>        at org.apache.hadoop.hbase.mapred.TableInputFormat.configure(TableInputFormat.java:51)
> >>        ... 28 more
> >>
> >>
> >> Can anyone tell me what I am doing wrong?
> >>
> >> Regards,
> >> Ondrej
> >>
>
>
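
For completeness, here is how that command would look with J-D's fix
applied. This is just a sketch: every argument is copied from Ondrej's
mail above, and the only change is the property name, from
hbase.mapred.tablecolumn to hbase.mapred.tablecolumns:

  hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar \
      -D hbase.mapred.tablecolumns=family1: \
      -input table1 -output /mtestout45 \
      -mapper test-map -numReduceTasks 1 -reducer test-reduce \
      -inputformat org.apache.hadoop.hbase.mapred.TableInputFormat

With the property spelled correctly, TableInputFormat.configure() finds
the column list and the NullPointerException no longer triggers, which
matches Ondrej's report above that the job then runs.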


-- 
阿昌
