hbase-user mailing list archives

From Yanbo Liang <yanboha...@gmail.com>
Subject Re: What is the output format of org.apache.hadoop.examples.Join?
Date Thu, 28 Mar 2013 03:46:05 GMT
Your output covers only a.txt joined with b.txt.
You still need to join c.txt as well.

2013/3/26 jingguo yao <yaojingguo@gmail.com>

> I am reading the following mail:
>
> http://www.mail-archive.com/core-user@hadoop.apache.org/msg04066.html
>
> I ran the following command (I am using Hadoop 1.0.4):
>
> bin/hadoop jar hadoop-examples-1.0.4.jar join \
>    -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat \
>    -outKey org.apache.hadoop.io.Text \
>    -joinOp outer \
>    join/a.txt join/b.txt join/c.txt joinout
>
>
> Then I ran "bin/hadoop fs -text joinout/part-00000" and saw the following
> result:
>
> AAAAAAAA        a0      [,]
> AAAAAAAA        b0      [,]
> BBBBBBBB        a1      [,]
> BBBBBBBB        b1      [,]
> BBBBBBBB        b2      [,]
> BBBBBBBB        b3      [,]
> CCCCCCCC        a2      [,]
> CCCCCCCC        a3      [,]
>
> But Chris said that the result should be:
>
> AAAAAAAA        [a0,b0,c0]
> BBBBBBBB        [a1,b1,c1]
> BBBBBBBB        [a1,b2,c1]
> BBBBBBBB        [a1,b3,c1]
> CCCCCCCC        [a2,,]
> CCCCCCCC        [a3,,]
> DDDDDDDD        [,,c2]
> DDDDDDDD        [,,c3]
>
> Has Join's output format changed in Hadoop 1.0.4?
>
>
> --
> Jingguo
>
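
For reference, the three-way outer join that the expected result above
describes can be sketched in plain Python. This is only an illustration of
the join semantics, not Hadoop's implementation. The contents of a.txt,
b.txt and c.txt are not shown in this thread, so the three lists below are
assumptions inferred from the expected output, and outer_join is a
hypothetical helper name.

```python
from itertools import product

# Assumed inputs: (key, value) pairs inferred from the expected output.
a = [("AAAAAAAA", "a0"), ("BBBBBBBB", "a1"), ("CCCCCCCC", "a2"), ("CCCCCCCC", "a3")]
b = [("AAAAAAAA", "b0"), ("BBBBBBBB", "b1"), ("BBBBBBBB", "b2"), ("BBBBBBBB", "b3")]
c = [("AAAAAAAA", "c0"), ("BBBBBBBB", "c1"), ("DDDDDDDD", "c2"), ("DDDDDDDD", "c3")]


def outer_join(*sources):
    """Outer-join any number of (key, value) lists on the key.

    For each key, emit the cross product of the values found in every
    source; a source with no value for that key contributes an empty slot.
    """
    keys = sorted({key for source in sources for key, _ in source})
    rows = []
    for key in keys:
        # One list of values per source, or [""] if the key is absent there.
        per_source = [
            [value for k, value in source if k == key] or [""]
            for source in sources
        ]
        for combo in product(*per_source):
            rows.append((key, "[" + ",".join(combo) + "]"))
    return rows


rows = outer_join(a, b, c)
for key, joined in rows:
    print(key + "\t" + joined)
```

Calling outer_join(a, b) with only two sources yields two-slot tuples such
as [a0,b0], which shows how the shape of each row depends on how many
inputs take part in the join.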
