hadoop-mapreduce-user mailing list archives

From "Wilkes, Chris" <cwil...@gmail.com>
Subject Re: Trying to relate a split file to a input file
Date Tue, 18 May 2010 17:50:06 GMT
In your setup(), look at context.getInputSplit(); in your case this
will be a FileSplit. From there you can call getPath() to see which
input file (including its directory) the split belongs to, and
getStart() and getLength() to see which part of that file it covers.
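
To make the relationship between a file and its splits concrete, here is a minimal sketch in plain Java. It is not Hadoop code and does not reproduce Hadoop's actual split calculation; SplitModel, Split, and splitsFor are hypothetical names, as are B.txt and C.txt. The three fields only mirror what FileSplit exposes through getPath(), getStart(), and getLength(), and the 120 MB and 64 MB figures come from the question below.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of file splits. This is NOT Hadoop code.
class SplitModel {

    // Mirrors the three values a FileSplit exposes: path, start offset, length.
    static final class Split {
        final String path;
        final long start;
        final long length;

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return path + ":" + start + "+" + length;
        }
    }

    // Cuts a file of fileLength bytes into consecutive chunks of at most
    // splitSize bytes. Every chunk keeps the path of the file it came from,
    // which is what lets a map task relate its split back to an input file.
    static List<Split> splitsFor(String path, long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            splits.add(new Split(path, start, length));
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Three input files of about 120 MB each, as in the question.
        for (String name : new String[] {"A.txt", "B.txt", "C.txt"}) {
            for (Split s : splitsFor(name, 120 * mb, 64 * mb)) {
                System.out.println(s);
            }
        }
    }
}
```

Under this simplified model a 120 MB file yields one 64 MB split and one 56 MB split, both carrying the same path. In a real job you would not compute any of this yourself; you would read the same three values from the FileSplit returned by context.getInputSplit().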

On May 18, 2010, at 10:01 AM, psdc1978 wrote:

> Hi,
>
> I'm studying the MapReduce code, and I have the following questions:
>
> 1 - I'm running the wordcount example. I have 3 txt files as input.
> Each txt file is about 120 MB.
>
> During the execution of the map tasks, a number of map tasks will
> read the txt files. Each file is divided into splits. I would
> like to know which txt file each split corresponds to.
> For example, for the A.txt file, 2 splits will be created (split0
> and split1) of 64 MB each. I would like to know that split0 and
> split1 belong to A.txt.
> Is it possible? If I have to write some code, is there any object that
> contains this data?
>
> 2 - The Job task uses a job.split file. What does this file contain,
> and what is its purpose?
>
> Thanks,
>
> -- 
> PSC

