hadoop-mapreduce-dev mailing list archives

From Pedro Costa <psdc1...@gmail.com>
Subject Re: Where the map task uses the set of locations?
Date Wed, 11 May 2011 17:01:11 GMT
Please disregard my question. I was looking at the wrong code.

On Wed, May 11, 2011 at 5:52 PM, Pedro Costa <psdc1978@gmail.com> wrote:
> Hi,
>
> I was looking at the mapred code, searching for the point where the
> split location is passed to the MapTask, and I found this line in the
> TaskInProgress class.
> [code]
> t = new MapTask(jobFile, taskid, partition, splitClass, split,
>         rawSplit.getFileName(), rawSplit.getLocations());
> [/code]
>
> The split variable holds the serialized split bytes (it is left empty
> for job setup and cleanup tasks):
>
> [code]
> BytesWritable split;
> if (!jobSetup && !jobCleanup) {
>     splitClass = rawSplit.getClassName();
>     split = rawSplit.getBytes();
> } else {
>     split = new BytesWritable();
> }
> [/code]
>
> "rawSplit.getFileName()" returns the full URL of the split's file
> (hdfs://chicon-7.fr:54310/user/xxx/gutenberg/A.txt), and the locations
> are the servers that hold the split ([chicon-7.fr, chinqchint-21.fr,
> chinqchint-38.fr]).
>
>
> 1 - Why are the split, the filename, and the set of locations all
> passed when a MapTask is created? If the split is passed, I assume
> the map task already has the split bytes that it will use. So why not
> pass just the split and ignore the filename and the set of locations?
>
>
>
> Thanks
>
> --
> ---------------------------
> PSC
>



-- 
---------------------------
Pedro Sá da Costa

@: pcosta@lasige.di.fc.ul.pt
@: psdc1978@gmail.com
