crunch-user mailing list archives

From Micah Whitacre <mkwhita...@gmail.com>
Subject Re: Retrieving Input File Name with MRPipeline
Date Mon, 22 Jun 2015 18:41:13 GMT
A DoFn gives you access to the TaskInputOutputContext[1] (via getContext()),
which should contain that information.  I believe the context's configuration
holds the input file under a property like "MAP_INPUT_FILE" (in Hadoop 2,
the MRJobConfig constant for "mapreduce.map.input.file").  I haven't tested
this, so definitely verify.


[1] https://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/TaskInputOutputContext.html
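
A rough sketch of the lookup described above, untested against a real
cluster. It assumes the input path is published under the Hadoop property
"mapreduce.map.input.file" (or the older "map.input.file"); whether that
property is actually set for a Crunch-planned job is exactly the part that
needs verifying. The class and method names (InputFileLookup, inputFile) are
made up for illustration, and the configuration is passed in as a plain
function so the helper can be tried without Hadoop on the classpath.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper, not part of Crunch or Hadoop.
final class InputFileLookup {
  // Property names Hadoop uses for the current map input file:
  // the Hadoop 2 name first, then the deprecated Hadoop 1 name.
  private static final String[] KEYS = {
    "mapreduce.map.input.file",
    "map.input.file"
  };

  private InputFileLookup() {}

  // 'conf' maps a property name to its value, or null if unset.
  // Returns the first input-file property that is present.
  static Optional<String> inputFile(Function<String, String> conf) {
    for (String key : KEYS) {
      String value = conf.apply(key);
      if (value != null) {
        return Optional.of(value);
      }
    }
    // Assumption to verify: the property may simply be absent.
    return Optional.empty();
  }
}
```

Inside a MapFn this could be called as
InputFileLookup.inputFile(getConfiguration()::get), since DoFn exposes the
job Configuration through getConfiguration(). An empty result would suggest
the property is not populated for that job and another approach is needed.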

On Mon, Jun 22, 2015 at 1:28 PM, David Ortiz <dpo5003@gmail.com> wrote:

> Hello,
>
>       Is there a way in my crunch pipeline that I can retrieve the file
> name of the input file for my MapFn?  This function is definitely applied
> as a Mapper, so I think it should be possible, just having some difficulty
> working through the exact method of doing so.
>
> Thanks,
>       Dave
>
