hadoop-mapreduce-user mailing list archives

From psdc1978 <psdc1...@gmail.com>
Subject Re: Trying to relate a split file to a input file
Date Tue, 18 May 2010 20:57:26 GMT
I don't think the wordcount example uses the FileSplit class. As far as I can
see, only the MultithreadedMapper class uses FileSplit, and I can't find an
example where it's invoked.

Where is the setup() method?
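
To make the file-to-split relationship concrete, here is a rough, self-contained sketch that does not use Hadoop at all. SplitSketch, Split and computeSplits are hypothetical names invented for illustration, and the 10% slop factor is an assumption about how FileInputFormat sizes the last split. It only shows that each split can be described by a file path, a start offset and a length, which is the same information a FileSplit exposes through getPath(), getStart() and getLength().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the idea behind FileSplit; not Hadoop code. */
final class SplitSketch {

    /** One byte range of one input file. */
    static final class Split {
        final String path;  // file the split belongs to
        final long start;   // byte offset of the split within the file
        final long length;  // number of bytes covered by the split

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return path + ":" + start + "+" + length;
        }
    }

    // Assumption: the last split may be up to 10% larger than splitSize
    // before another split is cut.
    static final double SPLIT_SLOP = 1.1;

    /** Cuts a file of fileLength bytes into consecutive byte ranges. */
    static List<Split> computeSplits(String path, long fileLength, long splitSize) {
        List<Split> splits = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(new Split(path, fileLength - remaining, splitSize));
            remaining -= splitSize;
        }
        if (remaining != 0) {
            // Whatever is left becomes the final, usually shorter, split.
            splits.add(new Split(path, fileLength - remaining, remaining));
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A 120 MB A.txt with a 64 MB split size, as in the question.
        for (Split s : computeSplits("A.txt", 120 * mb, 64 * mb)) {
            System.out.println(s);
        }
        // prints:
        // A.txt:0+67108864
        // A.txt:67108864+58720256
    }
}
```

In a real job the same three values would come from the task itself rather than being computed by hand: inside a Mapper written against the new API (org.apache.hadoop.mapreduce), context.getInputSplit() can be cast to FileSplit when a file-based input format is used.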



On Tue, May 18, 2010 at 6:50 PM, Wilkes, Chris <cwilkes@gmail.com> wrote:

> In your setup(), look at context.getInputSplit(); this will be a FileSplit
> in your case. From there you can call getPath() to see both the
> directory structure and the split value.
>
>
> On May 18, 2010, at 10:01 AM, psdc1978 wrote:
>
>> Hi,
>>
>> I'm studying the MapReduce code, and I have the following questions:
>>
>> 1 - I'm running the wordcount example. I have 3 txt files as input. Each txt
>> file is about 120 MB.
>>
>> During the execution of the map tasks, a number of map tasks will read the
>> txt files. Each file is divided into splits. I would like to know which
>> txt file each split corresponds to.
>> For example, for the A.txt file, 2 splits (split0 and split1) of up to
>> 64 MB each will be created. I would like to know that split0 and split1
>> belong to A.txt.
>> Is it possible? If I have to write some code, is there any object that
>> contains this data?
>>
>> 2 - The job uses a job.split file. What does this file contain, and what is
>> its purpose?
>>
>> Thanks,
>>
>> --
>> PSC
>>
>
>


-- 
Pedro
