hbase-user mailing list archives

From jingguo yao <yaojing...@gmail.com>
Subject Re: What is the output format of org.apache.hadoop.examples.Join?
Date Thu, 28 Mar 2013 06:26:10 GMT
Yanbo:

Sorry for pasting the wrong result.

The output of joining a.txt, b.txt and c.txt is as follows (still not
the same as the output Chris described):

AAAAAAAA        a0	[,,]
AAAAAAAA        b0	[,,]
AAAAAAAA        c0	[,,]
BBBBBBBB        a1	[,,]
BBBBBBBB        b1	[,,]
BBBBBBBB        b2	[,,]
BBBBBBBB        b3	[,,]
BBBBBBBB        c1	[,,]
CCCCCCCC        a2	[,,]
CCCCCCCC        a3	[,,]
DDDDDDDD        c2	[,,]
DDDDDDDD        c3	[,,]
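
For comparison, here is a minimal Python sketch of the outer-join semantics
Chris described. It is not Hadoop's implementation. The file contents are
inferred from the listings in this thread, and the helper names parse_line
and outer_join are hypothetical. It assumes each line is split on the first
tab, which is the default separator for KeyValueTextInputFormat, and that a
line with no tab becomes a key with an empty value. Under that assumption,
space-separated input reproduces the [,,] rows above, so the separator in
the input files may be worth checking.

```python
from itertools import product


def parse_line(line, sep="\t"):
    """Split on the first separator; with no separator the whole line is the key."""
    key, _, value = line.partition(sep)
    return key, value  # value is "" when sep does not occur in the line


def outer_join(sources, sep="\t"):
    """Full outer join: one output row per combination of values for each key."""
    tables = []
    for lines in sources:
        table = {}
        for line in lines:
            key, value = parse_line(line, sep)
            table.setdefault(key, []).append(value)
        tables.append(table)

    rows = []
    for key in sorted(set().union(*tables)):
        # A source that lacks the key contributes a single empty slot.
        slots = [table.get(key, [""]) for table in tables]
        for combo in product(*slots):
            rows.append(f"{key}\t[{','.join(combo)}]")
    return rows


# File contents inferred from the listings in this thread (an assumption).
A = ["AAAAAAAA\ta0", "BBBBBBBB\ta1", "CCCCCCCC\ta2", "CCCCCCCC\ta3"]
B = ["AAAAAAAA\tb0", "BBBBBBBB\tb1", "BBBBBBBB\tb2", "BBBBBBBB\tb3"]
C = ["AAAAAAAA\tc0", "BBBBBBBB\tc1", "DDDDDDDD\tc2", "DDDDDDDD\tc3"]

# The same data with eight spaces instead of a tab between key and value.
SPACED = [[line.replace("\t", " " * 8) for line in src] for src in (A, B, C)]

for row in outer_join([A, B, C]):
    print(row)
```

Calling outer_join(SPACED) instead treats every whole line as a distinct key
with an empty value, so each row ends in [,,].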


On Thu, Mar 28, 2013 at 11:46 AM, Yanbo Liang <yanbohappy@gmail.com> wrote:
> Your output is only a.txt joined with b.txt.
> You need to join c.txt as well.
>
> 2013/3/26 jingguo yao <yaojingguo@gmail.com>
>
>> I am reading the following mail:
>>
>> http://www.mail-archive.com/core-user@hadoop.apache.org/msg04066.html
>>
>> I ran the following command (I am using Hadoop 1.0.4):
>>
>> bin/hadoop jar hadoop-examples-1.0.4.jar join \
>>    -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat \
>>    -outKey org.apache.hadoop.io.Text \
>>    -joinOp outer \
>>    join/a.txt join/b.txt join/c.txt joinout
>>
>>
>> Then I ran "bin/hadoop fs -text joinout/part-00000" and saw the following
>> result:
>>
>> AAAAAAAA        a0      [,]
>> AAAAAAAA        b0      [,]
>> BBBBBBBB        a1      [,]
>> BBBBBBBB        b1      [,]
>> BBBBBBBB        b2      [,]
>> BBBBBBBB        b3      [,]
>> CCCCCCCC        a2      [,]
>> CCCCCCCC        a3      [,]
>>
>> But Chris said that the result should be:
>>
>> AAAAAAAA        [a0,b0,c0]
>> BBBBBBBB        [a1,b1,c1]
>> BBBBBBBB        [a1,b2,c1]
>> BBBBBBBB        [a1,b3,c1]
>> CCCCCCCC        [a2,,]
>> CCCCCCCC        [a3,,]
>> DDDDDDDD        [,,c2]
>> DDDDDDDD        [,,c3]
>>
>> Has Join's output format changed in Hadoop 1.0.4?
>>
>>
>> --
>> Jingguo
>>



-- 
Jingguo
