hadoop-mapreduce-user mailing list archives

From Vamshi Krishna <vamshi2...@gmail.com>
Subject how to specify key and value for an input to mapreduce job
Date Tue, 14 Feb 2012 14:58:59 GMT
Hi all,
I have a job which reads all the rows from an HBase table and writes them
to a location in DFS, i.e. /user/HSOP. HSOP is a folder containing 9
files, each with content like:
00015DEGgJ    -HM
00016Pc4Tl    -HM
0001H0iImI    -HM
0001Oyb0Ju    -HM
0001hwBEOr    -HM
0002Qx2Uj9    -HM
0002jCs6gr    -HM
0003PMcWRa    -HM
000488xKIE    -HM

Both the first and second columns are of Text type, as specified in the first
job's output key/value classes.
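
(For illustration, a minimal sketch of the kind of first job described here; the
table name "myTable", column family "cf", qualifier "col", and the map-only setup
are placeholder assumptions, not taken from the actual code:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ExportTableToDfs {

  // Emit the row key as Text and one column value as Text.
  static class RowMapper extends TableMapper<Text, Text> {
    @Override
    protected void map(ImmutableBytesWritable row, Result result, Context context)
        throws java.io.IOException, InterruptedException {
      byte[] cell = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"));
      context.write(new Text(Bytes.toString(row.get())),
                    new Text(cell == null ? "" : Bytes.toString(cell)));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "export table to /user/HSOP");
    job.setJarByClass(ExportTableToDfs.class);

    Scan scan = new Scan();
    TableMapReduceUtil.initTableMapperJob("myTable", scan, RowMapper.class,
        Text.class, Text.class, job);

    // Map-only: the default TextOutputFormat writes "key TAB value" lines,
    // which matches the tab-separated files shown above.
    job.setNumReduceTasks(0);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path("/user/HSOP"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}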

Now I want one more job to read all these files as input and treat the first
column element as the "key" and the second column element as the "value".
For that I tried starting a job with the line
job.getConfiguration().set("key.value.separator.in.input.line", "-");
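
(A minimal sketch of that kind of driver setup, assuming a Hadoop release that
ships org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat in the new
API and that reads this separator property; the class name, output path, and
separator shown are illustrative only:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SecondJobDriver {
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "read /user/HSOP");
    job.setJarByClass(SecondJobDriver.class);

    // Split each line at the first occurrence of the separator character:
    // everything before it becomes the Text key, everything after it the Text value.
    // (Note: the property name for this separator differs between the old and
    // new APIs in some Hadoop releases.)
    job.getConfiguration().set("key.value.separator.in.input.line", "-");
    job.setInputFormatClass(KeyValueTextInputFormat.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    FileInputFormat.addInputPath(job, new Path("/user/HSOP"));
    FileOutputFormat.setOutputPath(job, new Path("/user/HSOP_second"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}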

In the reduce() method I have context.write(key, value); the key is
LongWritable and the value is Text. But when I look at the output of this job,
I see a format like:

46    0002mCjpo9    -HM
253    000AxT9LSA    -HM
460    000FYtnxiB    -HM
667    000WNVBo9N    -HM
874    000dQiseKz    -HM
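
(For reference, a sketch of the reduce side being described. The leading numeric
column is what appears when the job still runs with the default TextInputFormat,
whose keys are the LongWritable byte offsets of the lines; the class name and the
identity-style body here are assumptions:)

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// With the default TextInputFormat (and an identity-style mapper), the key that
// reaches reduce() is the LongWritable byte offset of the input line, so writing
// it back out prepends that offset to every row.
public class LineReducer extends Reducer<LongWritable, Text, LongWritable, Text> {
  @Override
  protected void reduce(LongWritable key, Iterable<Text> values, Context context)
      throws java.io.IOException, InterruptedException {
    for (Text value : values) {
      context.write(key, value);   // output line becomes: <offset> TAB <original line>
    }
  }
}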

But I don't want that first column to be added to each row. How can I avoid
that? Can somebody please help?
