hadoop-user mailing list archives

From Sigurd Spieckermann <sigurd.spieckerm...@gmail.com>
Subject Re: how to differentiate which input directory current record comes from?
Date Sat, 15 Dec 2012 10:17:39 GMT
If you use the new API (org.apache.hadoop.mapreduce), you can access the
MapContext object in the setup method of the mapper. From it, you can get the
input split with MapContext#getInputSplit(), cast it to FileSplit, and obtain
the path of the file the current split belongs to through
FileSplit#getPath(). All records seen by that mapper instance come from the
same split, so you only need to look this up once per map task, in the setup
method.
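
A minimal sketch of that approach, in case it helps. The class name
SourceTaggingMapper and the Text/Text output types are placeholders of mine,
not anything from your job, and it assumes a file-based input format whose
splits really are FileSplit instances:

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Hypothetical mapper: tags every record with the directory it was read from.
public class SourceTaggingMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Parent directory of the file behind this task's split; set once in setup().
    private final Text sourceDir = new Text();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Assumption: the input format produces FileSplit instances.
        // The cast throws ClassCastException for any other split type.
        FileSplit split = (FileSplit) context.getInputSplit();
        Path file = split.getPath();
        sourceDir.set(file.getParent().toString());
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit each record keyed by its source directory.
        context.write(sourceDir, value);
    }
}
```

The cast is the fragile part: it only works when the split handed to the
mapper is a FileSplit. If the job's inputs are registered through
MultipleInputs, the split may be wrapped in a different class and the cast
can fail, so check the actual split type in your setup before relying on it.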
On 14.12.2012 at 19:46, "Xiaowei Li" <selluck@gmail.com> wrote:

> hi,
>
> my MR job has multiple inputs, and I want to know how to tell which
> input directory the current row/record comes from in my mapper?
>
> thanks!
> -xw
>
