hadoop-user mailing list archives

From Mohammad Tariq <donta...@gmail.com>
Subject Re: Hadoop processing
Date Thu, 08 Nov 2012 15:05:36 GMT
Hello Andy,

     Just to add to what Mr. Jay has said, the MapReduce framework does its
best to run each map task on a node where the input data is present.
Sometimes, however, none of the nodes hosting a replica of the data block
for a map task's input split (how many there are depends on the
replication factor) has a free map slot. In that case, the job scheduler
will look for a free map slot on a node in the same rack as one of the
replicas. Very occasionally even this is not possible, so an off-rack node
is used and the block is read over the network. So the new nodes can run
map tasks even before the balancer has moved any data onto them.
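
To make the order of preference concrete, here is a minimal sketch in Python. It is only an illustration, not Hadoop's actual scheduler code: the names Node, free_map_slots and pick_node are made up for this example, and the real scheduler hands out tasks as TaskTrackers report in rather than searching the whole cluster in one go.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rack: str
    free_map_slots: int


def pick_node(replica_hosts, nodes):
    """Return (node, locality) for one map task, or (None, None) if no slot is free.

    replica_hosts: names of the nodes holding a replica of the task's input block.
    nodes: all task-running nodes in the cluster.
    """
    free = [n for n in nodes if n.free_map_slots > 0]
    replica_racks = {n.rack for n in nodes if n.name in replica_hosts}

    # 1. Prefer a node that already stores the block (data-local).
    for n in free:
        if n.name in replica_hosts:
            return n, "data-local"
    # 2. Otherwise, a node on the same rack as one of the replicas (rack-local).
    for n in free:
        if n.rack in replica_racks:
            return n, "rack-local"
    # 3. Otherwise, any node with a free slot (off-rack).
    for n in free:
        return n, "off-rack"
    return None, None


if __name__ == "__main__":
    cluster = [
        Node("dn1", "rack1", 0),   # holds a replica, but all slots are busy
        Node("dn2", "rack1", 0),
        Node("dn3", "rack2", 0),
        Node("new1", "rack1", 2),  # newly added, holds no blocks yet
    ]
    node, locality = pick_node({"dn1", "dn2", "dn3"}, cluster)
    print(node.name, locality)
```

In this made-up cluster the three nodes holding the replicas are busy, so the task lands on the freshly added node even though it stores no blocks.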

    Mohammad Tariq

On Thu, Nov 8, 2012 at 8:19 PM, Jay Vyas <jayunit100@gmail.com> wrote:

> Hmm, this is interesting.  I think that:
> 1) For the map phase, Hadoop is smart enough to try to run mappers
> locally, but I think you could force these DNs to actively participate in a
> map job by decreasing the size of the input splits, which would allow for
> many more mappers, some of which would be forced to run on data that is
> not necessarily local. In this scenario, those DNs don't yet have any
> local files on them that would be used for the input.
> 2) For the reduce phase: since the reducers will of course be copying
> mapper outputs from all over the cluster, one would expect that your
> DataNodes would naturally take part in this portion of the job if the
> number of reducers (the mapred.reduce.tasks setting) was specified.
> On Thu, Nov 8, 2012 at 9:35 AM, Kartashov, Andy <Andy.Kartashov@mpac.ca> wrote:
>>  Hadoopers,
>> “Hadoop ships the code to the data instead of sending the data to the
>> code.”
>> Say you added two DNs/TTs to the cluster. They have no data at this
>> point, i.e. you have not run the balancer.
>> In view of the above quoted statement, will these two nodes not
>> participate in the MapReduce job until you balanced some data onto those
>> nodes? Please kindly elaborate.
>> Rgds,
>> AK47
> --
> Jay Vyas
> http://jayunit100.blogspot.com
