hadoop-yarn-dev mailing list archives

From Bikas Saha <bi...@hortonworks.com>
Subject RE: Container size configuration
Date Wed, 19 Jun 2013 17:46:41 GMT
That's correct. The previous behavior of the MR AM was tightly coupled to
the scheduler implementation and was therefore fragile. The RM is not
supposed to give a container smaller than what was requested, because that
would be incorrect. It can always give a container larger than requested,
based on its internal heuristics, though ideally it should not, so as to
avoid internal fragmentation.
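
For illustration only (a minimal sketch, not the actual MR AM code; the
class and method names are made up), an AM that tolerates the round-up
would match an allocation by "at least as large" rather than by exact size:

    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.util.Records;

    public final class AllocationCheck {
      // Treat an allocation as satisfying a request if it is at least as
      // large as what was asked for; the RM may have rounded the size up.
      // Memory only, for brevity.
      static boolean satisfies(Resource requested, Resource granted) {
        return granted.getMemory() >= requested.getMemory();
      }

      public static void main(String[] args) {
        Resource asked = Records.newRecord(Resource.class);
        asked.setMemory(1500);
        Resource given = Records.newRecord(Resource.class);
        given.setMemory(2048);  // RM rounded 1500MB up to 2048MB
        System.out.println(satisfies(asked, given));  // true
      }
    }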

Alejandro, after MAPREDUCE-5310, did we check that the MR AM works
correctly when the map/reduce memory settings differ from the normalized
values?

Bikas

-----Original Message-----
From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
Sent: Tuesday, June 18, 2013 10:59 AM
To: yarn-dev@hadoop.apache.org
Subject: Re: Container size configuration

Bobby,

With MAPREDUCE-5310 we removed normalization of resource requests on the
MRAM side. This was done because normalization is an implementation detail
of the RM scheduler.

IMO, if this is a problem for the MRAM as you suggest, then we should fix
the MRAM logic.

Note this may happen only if the MR job specifies memory requirements for
its tasks that do not match the normalized values.
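
To make the normalization concrete, a rough sketch of the kind of rounding
the RM scheduler applies (an approximation of the general scheme, not the
actual scheduler code; the class and method names are just for
illustration):

    public final class NormalizeSketch {
      // Round a memory request up to a multiple of the scheduler's minimum
      // allocation and clamp it to the maximum allocation.
      static int normalizeMemoryMb(int requestedMb, int minMb, int maxMb) {
        int rounded = ((requestedMb + minMb - 1) / minMb) * minMb;
        return Math.min(Math.max(rounded, minMb), maxMb);
      }

      public static void main(String[] args) {
        // A task asking for 1500MB with a 1024MB minimum gets a 2048MB
        // container, which no longer matches the value the MR AM asked for.
        System.out.println(normalizeMemoryMb(1500, 1024, 8192)); // 2048
      }
    }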

Thanks.



On Tue, Jun 18, 2013 at 10:45 AM, Robert Evans <evans@yahoo-inc.com>
wrote:

> Even returning an oversized container can be very confusing for an
> application.  The MR AM will not handle it correctly.  If it sees a
> returned container that does not exactly match the priority and size it
> expects, I believe that container is thrown away.  We had deadlocks in
> the past where it somehow used a reducer container for a mapper and then
> never updated the reducer count to request a new one.  It is best for
> now not to mix the two, and we need to lock down/fix the semantics of
> what happens in those situations for a scheduler.
>
> --Bobby
>
> On 6/18/13 12:13 AM, "Bikas Saha" <bikas@hortonworks.com> wrote:
>
> >I think the API allows different size requests at the same priority.
> >The implementation of the scheduler drops the size information and
> >uses the last value set. We should probably at least change it to use
> >the largest value requested so that users don't get containers that
> >are too small for them.
> >YARN-847 tracks this.
> >
> >Bikas
> >
> >-----Original Message-----
> >From: Robert Evans [mailto:evans@yahoo-inc.com]
> >Sent: Friday, June 14, 2013 7:09 AM
> >To: yarn-dev@hadoop.apache.org
> >Subject: Re: Container size configuration
> >
> >Is this specifically for YARN?  If so, yes, you can do this; MR does
> >this for Maps vs Reduces.  The API right now requires that the
> >different-sized containers have different priorities.
> >
> >
> >http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> >
> >That page shows how to make a ResourceRequest and how to build an
> >AllocateRequest.  If you put multiple ResourceRequests into the
> >AllocateRequest, it will allocate both of them.  But remember that the
> >priorities need to be different, and the priority determines the order
> >in which the containers will be allocated to your application.
> >
> >--Bobby
> >
> >On 6/13/13 10:41 AM, "Yuzhang Han" <yuzhanghan1982@gmail.com> wrote:
> >
> >>Hi,
> >>
> >>I am wondering if I can allocate containers of different sizes to the
> >>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 =
> >>Task2 = 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
> >>
> >>Yuzhang
>
>
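
Following up on Bobby's ResourceRequest/AllocateRequest explanation quoted
above, a minimal sketch of asking for two container sizes at two
priorities. The class and helper names are made up, and setter names vary
across 2.x releases (e.g. setResourceName vs. setHostName on older
releases), so treat it as an approximation of the linked
WritingYarnApplications style rather than exact 2.0.5-alpha code:

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.api.records.ResourceRequest;
    import org.apache.hadoop.yarn.util.Records;

    public final class TwoSizedRequests {
      // Build one ResourceRequest; resource name "*" means "any node".
      static ResourceRequest ask(int priority, int memoryMb, int count) {
        Priority pri = Records.newRecord(Priority.class);
        pri.setPriority(priority);
        Resource capability = Records.newRecord(Resource.class);
        capability.setMemory(memoryMb);
        ResourceRequest req = Records.newRecord(ResourceRequest.class);
        req.setPriority(pri);
        req.setResourceName("*");  // setHostName("*") on older releases
        req.setCapability(capability);
        req.setNumContainers(count);
        return req;
      }

      public static void main(String[] args) {
        // Two 1024MB containers at one priority and one 2048MB container at
        // another; both requests then go into the ask list of a single
        // AllocateRequest sent to the RM.
        ResourceRequest small = ask(10, 1024, 2);
        ResourceRequest large = ask(20, 2048, 1);
        System.out.println(small + "\n" + large);
      }
    }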


--
Alejandro
