hadoop-common-user mailing list archives

From leibnitz <se3g2...@gmail.com>
Subject some questions about 'Hadoop:The.Definitive.Guide.'
Date Wed, 08 Sep 2010 18:32:43 GMT

Hi all,
While studying chapter 8 of that book, I came across some sentences I can't
understand, and I could not find an explanation for them in the Javadoc. They are:
a. Reduce-side joins, page 236, says:
"The reducer knows that it will receive the station record first, so it
extracts its name from the value and writes it out as a part of every
output record (Example 8-14)."
Why is the station record always received first?

b. Example 8-15, page 237, contains this code fragment:
conf.setOutputValueGroupingComparator(TextPair.FirstComparator.class);
I understand this to mean that records are grouped by the first field of
the pair alone, so keys with the same first field end up in the same group.
But in the output,
011990-99999 SIHCCAJAVRI 0067011990999991950051507004+68750...
011990-99999 SIHCCAJAVRI 0043011990999991950051512004+68750...
the first field of the pair is '011990-99999'. Why is it repeated in the
output?
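
For reference, here is a minimal plain-Java sketch of the mechanism both
questions touch on. It does not use Hadoop at all; it only imitates a sort on
a composite key followed by grouping on the first field. The names JoinSketch,
Rec and sortAndReduce are hypothetical, and the tags "0" (station) and "1"
(weather record) are an assumption about how the mappers might mark their
output, not something confirmed above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java imitation of sort + grouping; no Hadoop classes are involved.
// All names here are illustrative assumptions, not the book's code.
class JoinSketch {

    // One map output record: composite key (stationId, tag) plus a value.
    static final class Rec {
        final String stationId, tag, value;
        Rec(String stationId, String tag, String value) {
            this.stationId = stationId;
            this.tag = tag;
            this.value = value;
        }
    }

    static List<String> sortAndReduce(List<Rec> mapOutput) {
        List<Rec> sorted = new ArrayList<>(mapOutput);
        // Sort on the full composite key: station ID first, then tag.
        // Within one station ID, tag "0" therefore precedes tag "1".
        sorted.sort(Comparator.comparing((Rec r) -> r.stationId)
                              .thenComparing(r -> r.tag));

        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < sorted.size()) {
            // Grouping compares only the first field, so every record with
            // this station ID belongs to the same "reduce call".
            String id = sorted.get(i).stationId;
            // Assumes each group really starts with a station record.
            String stationName = sorted.get(i).value;
            i++;
            while (i < sorted.size() && sorted.get(i).stationId.equals(id)) {
                // One output line per weather record; the station ID is
                // written on each of them.
                out.add(id + "\t" + stationName + "\t" + sorted.get(i).value);
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> mapOutput = new ArrayList<>();
        mapOutput.add(new Rec("011990-99999", "1",
                "0067011990999991950051507004+68750..."));
        mapOutput.add(new Rec("011990-99999", "0", "SIHCCAJAVRI"));
        mapOutput.add(new Rec("011990-99999", "1",
                "0043011990999991950051512004+68750..."));
        for (String line : sortAndReduce(mapOutput)) {
            System.out.println(line);
        }
    }
}
```

Under these assumptions the station record comes first only because its tag
sorts lowest within the station ID, and the ID appears on every output line
because one line is written per weather record in the group.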

Thanks in advance!


