hadoop-common-user mailing list archives

From Saptarshi Guha <saptarshi.g...@gmail.com>
Subject Re: The name of the current input file during a map
Date Thu, 26 Nov 2009 07:27:31 GMT
Hello again,
I'm using Hadoop 0.21 and its Context object, e.g.:

 public void setup(Context context) {
     Configuration cfg = context.getConfiguration();
     System.out.println("mapred.input.file=" + cfg.get("mapred.input.file"));
 }

This displays null, so maybe the property was dropped by mistake in the API change?
Regards
Saptarshi
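
For the indexing use case described in the quoted question below (emit each key together with the part-r-* file it came from), one option that does not depend on a configuration property is to take the path from the input split. With the new API, context.getInputSplit() returns the current split, and when the job reads through a FileInputFormat it can usually be cast to org.apache.hadoop.mapreduce.lib.input.FileSplit, whose getPath() gives the file. The sketch below is only the plain-Java part: it reduces a full path string to the bare file name to use as the index value. The class and method names (InputFileNames, fileNameOf) are hypothetical, and the Hadoop calls appear only in comments so that the block compiles on its own.

```java
// Hypothetical helper, not part of Hadoop: reduces a full input path to
// the bare file name (e.g. "part-r-00003") for use as an index value.
final class InputFileNames {

    private InputFileNames() {
        // Static helper only; not meant to be instantiated.
    }

    // Returns everything after the last '/' in the given path or URI
    // string, or the string itself when it contains no '/'.
    static String fileNameOf(String pathOrUri) {
        if (pathOrUri == null) {
            throw new IllegalArgumentException("pathOrUri must not be null");
        }
        int slash = pathOrUri.lastIndexOf('/');
        return slash < 0 ? pathOrUri : pathOrUri.substring(slash + 1);
    }

    // Assumed use inside a new-API Mapper (requires Hadoop on the
    // classpath, and assumes the split really is a FileSplit):
    //
    //   FileSplit split = (FileSplit) context.getInputSplit();
    //   String name = InputFileNames.fileNameOf(split.getPath().toString());
    //   context.write(key, new Text(name));
}
```

Hadoop's own Path.getName() returns the same final path component, so inside a mapper split.getPath().getName() may be all that is needed. The helper is only useful when the path arrives as a plain string, for example from a configuration value.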


On Thu, Nov 26, 2009 at 2:13 AM, Saptarshi Guha
<saptarshi.guha@gmail.com> wrote:
> Thank you.
> Regards
> Saptarshi
>
> On Thu, Nov 26, 2009 at 2:10 AM, Amogh Vasekar <amogh@yahoo-inc.com> wrote:
>> conf.get("map.input.file") is what you need.
>>
>> Amogh
>>
>>
>> On 11/26/09 12:35 PM, "Saptarshi Guha" <saptarshi.guha@gmail.com> wrote:
>>
>> Hello,
>> I have a set of input files part-r-* which I will pass through another
>> map (no reduce). The part-r-* files consist of key-value pairs, the keys
>> being small and the values fairly large (MBs).
>>
>> I would like to index these, i.e. run a map whose output is the key and
>> the /filename/, i.e. the part-r-* file to which that key belongs, so
>> that if I need them again I can just access that file.
>>
>> Q: In the map stage, how do I retrieve the name of the file being
>> processed?  I'd rather not use the MapFileOutputFormat.
>>
>> Hadoop 0.21
>>
>> Regards
>> Saptarshi
>>
>>
>
