hbase-user mailing list archives

From Yanbo Liang <yanboha...@gmail.com>
Subject Re: What is the output format of org.apache.hadoop.examples.Join?
Date Mon, 01 Apr 2013 11:06:50 GMT
Could you give more details about how you ran the job: the exact parameters, the
Hadoop version, etc.?
Judging from the design and the source code, your output does not look reasonable.
The reduce stage of the MR job should merge the values into a TupleWritable.


2013/3/28 jingguo yao <yaojingguo@gmail.com>

> Yanbo:
>
> Sorry for pasting the wrong result.
>
> The output for joining a.txt, b.txt and c.txt is as follows (still not
> the same as the one produced by Chris):
>
> AAAAAAAA        a0      [,,]
> AAAAAAAA        b0      [,,]
> AAAAAAAA        c0      [,,]
> BBBBBBBB        a1      [,,]
> BBBBBBBB        b1      [,,]
> BBBBBBBB        b2      [,,]
> BBBBBBBB        b3      [,,]
> BBBBBBBB        c1      [,,]
> CCCCCCCC        a2      [,,]
> CCCCCCCC        a3      [,,]
> DDDDDDDD        c2      [,,]
> DDDDDDDD        c3      [,,]
>
>
> On Thu, Mar 28, 2013 at 11:46 AM, Yanbo Liang <yanbohappy@gmail.com>
> wrote:
> > Your output only joins a.txt and b.txt.
> > You need to join c.txt as well.
> >
> > 2013/3/26 jingguo yao <yaojingguo@gmail.com>
> >
> >> I am reading the following mail:
> >>
> >> http://www.mail-archive.com/core-user@hadoop.apache.org/msg04066.html
> >>
> >> After running the following command (I am using Hadoop 1.0.4):
> >>
> >> bin/hadoop jar hadoop-examples-1.0.4.jar join \
> >>    -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat \
> >>    -outKey org.apache.hadoop.io.Text \
> >>    -joinOp outer \
> >>    join/a.txt join/b.txt join/c.txt joinout
> >>
> >>
> >> Then I run "bin/hadoop fs -text joinout/part-00000". I see the following
> >> result:
> >>
> >> AAAAAAAA        a0      [,]
> >> AAAAAAAA        b0      [,]
> >> BBBBBBBB        a1      [,]
> >> BBBBBBBB        b1      [,]
> >> BBBBBBBB        b2      [,]
> >> BBBBBBBB        b3      [,]
> >> CCCCCCCC        a2      [,]
> >> CCCCCCCC        a3      [,]
> >>
> >> But Chris said that the result should be:
> >>
> >> AAAAAAAA        [a0,b0,c0]
> >> BBBBBBBB        [a1,b1,c1]
> >> BBBBBBBB        [a1,b2,c1]
> >> BBBBBBBB        [a1,b3,c1]
> >> CCCCCCCC        [a2,,]
> >> CCCCCCCC        [a3,,]
> >> DDDDDDDD        [,,c2]
> >> DDDDDDDD        [,,c3]
> >>
> >> Has Join's output format changed in Hadoop 1.0.4?
> >>
> >>
> >> --
> >> Jingguo
> >>
>
>
>
> --
> Jingguo
>
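
For reference, the result that Chris describes in the quoted mail corresponds to a plain outer join over the three inputs. The sketch below is not the Hadoop implementation; it only illustrates the expected semantics in plain Python. The contents of a.txt, b.txt and c.txt are not shown anywhere in this thread, so the lists A, B and C are assumptions inferred from the quoted outputs, and the name outer_join is hypothetical.

```python
from itertools import product

# Assumed contents of a.txt, b.txt and c.txt as (key, value) pairs.
# These are inferred from the outputs quoted in the thread, not taken
# from the actual files.
A = [("AAAAAAAA", "a0"), ("BBBBBBBB", "a1"),
     ("CCCCCCCC", "a2"), ("CCCCCCCC", "a3")]
B = [("AAAAAAAA", "b0"), ("BBBBBBBB", "b1"),
     ("BBBBBBBB", "b2"), ("BBBBBBBB", "b3")]
C = [("AAAAAAAA", "c0"), ("BBBBBBBB", "c1"),
     ("DDDDDDDD", "c2"), ("DDDDDDDD", "c3")]


def outer_join(*sources):
    """Outer-join any number of (key, value) lists on the key.

    For each key, emit one tuple per combination of values across the
    sources. A source with no value for the key contributes an empty slot.
    """
    keys = sorted({key for source in sources for key, _ in source})
    rows = []
    for key in keys:
        per_source = []
        for source in sources:
            values = [v for k, v in source if k == key]
            # An empty string stands in for a missing value in this source.
            per_source.append(values or [""])
        for combo in product(*per_source):
            rows.append((key, "[" + ",".join(combo) + "]"))
    return rows


for key, joined in outer_join(A, B, C):
    print(f"{key}\t{joined}")
```

Under these assumed inputs, each output line has one key followed by a single bracketed tuple with one slot per input file, which is the shape of the result Chris reported. The outputs pasted in this thread instead show one line per input record with an empty tuple, so comparing the two may help narrow down where the values are lost.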
