hadoop-mapreduce-user mailing list archives

From Henry Hung <YTHu...@winbond.com>
Subject RE: fair scheduler not working as intended
Date Wed, 13 Aug 2014 06:01:02 GMT
Hi Yehia,

Oh? I thought that by using maxResources = 15360 mb (3072 mb * 5), vcores = 5, and maxMaps
= 5, I was already restricting the job to using at most 5 maps.

The reason is that my long-running job has 841 maps, and each map processes data for almost 2 hours.
In the meantime there will be some short jobs that only need a couple of minutes to complete.
That is why I use the fair scheduler to split resources into 2 queues, one default and the other
longrun.
I want to make sure there are always resources available for the short jobs.

If your explanation is correct, then the current fair scheduler behavior is not what I want.
So is there any other way to set up YARN resources to accommodate both the short and the long-running jobs?
Or do I need to create 2 separate YARN clusters? (I have been thinking about this approach.)
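
One thing I have been wondering about is whether an allocation roughly like the one below could do it without a second cluster. This is just an untested sketch, and I am assuming the minResources, maxRunningApps and weight elements (and the yarn.scheduler.fair.preemption switch in yarn-site.xml) behave the way the fair scheduler documentation describes:

<allocations>
  <queue name="longrun">
    <!-- hard cap for the long-running job: 5 x 3072 mb map containers -->
    <maxResources>15360 mb, 5 vcores</maxResources>
    <!-- only one long-running application at a time -->
    <maxRunningApps>1</maxRunningApps>
    <weight>0.5</weight>
  </queue>
  <queue name="default">
    <!-- guaranteed minimum so short jobs always have room to start -->
    <minResources>15360 mb, 5 vcores</minResources>
    <weight>1.0</weight>
  </queue>
</allocations>

The idea would be to guarantee the default queue a minimum share for the short jobs and to enable preemption so that share can be reclaimed from the long-running job when short jobs arrive.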

Best regards,
Henry

From: Yehia Elshater [mailto:y.z.elshater@gmail.com]
Sent: Wednesday, August 13, 2014 11:27 AM
To: user@hadoop.apache.org
Subject: Re: fair scheduler not working as intended

Hi Henry,

Are any applications on queues other than the longrun queue running at the same time? I think
the FairScheduler is going to assign more resources to your "longrun" queue as long as no other
applications are running in the other queues.
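
If I read your configuration right, and assuming the default queue keeps the FairScheduler default weight of 1.0, the steady-state fair share of "longrun" works out to roughly 0.5 / (0.5 + 1.0) ≈ 33% of the cluster, which lines up with the ~30% fair share you mentioned seeing on the scheduler page.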

Thanks
Yehia

On 12 August 2014 20:30, Henry Hung <YTHung1@winbond.com> wrote:
Hi Everyone,

I’m using Hadoop 2.2.0 with the fair scheduler in my YARN cluster, but something is wrong with
the fair scheduler.

Here is what my fair-scheduler.xml looks like:

<allocations>
  <queue name="longrun">
    <maxResources>15360 mb, 5 vcores</maxResources>
    <weight>0.5</weight>
    <minMaps>2</minMaps>
    <maxMaps>5</maxMaps>
    <minReduces>1</minReduces>
  </queue>
</allocations>

I created a “longrun” queue to ensure that the huge MR application can use at most 5 containers.
My YARN setup requests 3072 MB of memory for each container:

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
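
With each map container requesting 3072 MB, the 15360 mb limit in maxResources should work out to 15360 / 3072 = 5 concurrent map containers for the “longrun” queue.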

When the huge application started, it worked just fine and the scheduler restricted it to running
only 5 maps in parallel.
But after running for some time, the application was running 10 maps in parallel.
The scheduler page showed that the “longrun” queue was using 66%, exceeding its fair share of 30%.
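In other words, 10 maps × 3072 MB = 30720 MB of memory, double the 15360 mb cap I configured in maxResources.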

Can anyone tell me why the application can get more than its share?
Is the problem with my configuration, or is there a bug?

Best regards,
Henry Hung


