hadoop-hdfs-user mailing list archives

From Mark Olimpiati <markq2...@gmail.com>
Subject Re: assign tasks to specific nodes
Date Wed, 11 Sep 2013 23:08:04 GMT
Hi Vinod, I asked about node assignment at first, but in my second email I
explained that I want to change the order of data partition execution. By
default, tasks are run based on the *size* of the partition assigned to each
of them. Now I want to run the tasks so that a specific order of partitions
is followed.

Eg. First assume the input is a directory Houses/ with files {Villa,
Apartment, Room}, such that file "Villa" is larger in size than "Apartment",
which is larger than "Room".

The default Hadoop would run:
map1 --> Villa
map2 --> Apartment
map3 --> Room

I want to assign priorities to the *data partitions* such that Apartment=1,
Room=2, Villa=3, so that the scheduler runs the maps in this order:
map1 --> Apartment
map2 --> Room
map3 --> Villa
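If it helps to see the two orderings side by side, here is a plain-Java sketch
(the byte sizes are made up for illustration; this only models the two
comparators, not Hadoop's actual split sorting or scheduling):

```java
import java.util.*;

public class SplitOrderDemo {
    // Hypothetical file sizes in bytes; names come from the example above.
    static final Map<String, Long> SIZES = Map.of(
            "Villa", 300L, "Apartment", 200L, "Room", 100L);

    // Default behaviour described above: partitions ordered by size, largest first.
    static List<String> defaultOrder() {
        List<String> files = new ArrayList<>(SIZES.keySet());
        files.sort((a, b) -> Long.compare(SIZES.get(b), SIZES.get(a)));
        return files;
    }

    // Desired behaviour: an explicit per-partition priority (lower runs first).
    static List<String> priorityOrder(Map<String, Integer> priority) {
        List<String> files = new ArrayList<>(SIZES.keySet());
        files.sort(Comparator.comparing(priority::get));
        return files;
    }

    public static void main(String[] args) {
        System.out.println("default:  " + defaultOrder());
        Map<String, Integer> prio = Map.of("Apartment", 1, "Room", 2, "Villa", 3);
        System.out.println("priority: " + priorityOrder(prio));
    }
}
```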

My question is: is that possible? Note that this is regardless of which
nodes the tasks are assigned to.
Thank you,

On Wed, Sep 11, 2013 at 10:45 AM, Vinod Kumar Vavilapalli <
vinodkv@apache.org> wrote:

> I assume you are talking about MapReduce. And 1.x release or 2.x?
> In either of the releases, this cannot be done directly.
> In 1.x, the framework doesn't expose a feature like this as it is a shared
> service, and if enough jobs flock to a node, it will lead to utilization
> and failure handling issues.
> In Hadoop 2 YARN, the platform does expose this functionality, but the
> MapReduce framework doesn't yet expose it to end users.
> What exactly is your use case? Why are some nodes of higher priority than
> others?
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
> On Sep 11, 2013, at 10:09 AM, Mark Olimpiati wrote:
> Thanks for replying Ravi, but the link talks about reducers, which seems
> like a similar case. But what if I assigned priorities to the data
> partitions (eg. partition B=1, partition C=2, partition A=3, ...) such
> that the first map task is assigned partition B to run first, then the
> second map is given partition C, etc. This is instead of assigning based
> on partition size. Is that possible?
> Thanks,
> Mark
> On Mon, Sep 9, 2013 at 11:17 AM, Ravi Prakash <ravihoo@ymail.com> wrote:
>> http://lucene.472066.n3.nabble.com/Assigning-reduce-tasks-to-specific-nodes-td4022832.html
>>   ------------------------------
>>  *From:* Mark Olimpiati <markq2011@gmail.com>
>> *To:* user@hadoop.apache.org
>> *Sent:* Friday, September 6, 2013 1:47 PM
>> *Subject:* assign tasks to specific nodes
>> Hi guys,
>>    I'm wondering if there is a way for me to assign tasks to specific
>> machines or at least assign priorities to the tasks to be executed in that
>> order. Any suggestions?
>> Thanks,
>> Mark
