hadoop-user mailing list archives

From 麦树荣 <shurong....@qunar.com>
Subject Re: problems of FairScheduler in hadoop2.2.0
Date Wed, 27 Nov 2013 09:28:22 GMT
Hi,

Sorry, here is some additional information.

Hadoop 2.2.0 had been running normally for several days after I started the Hadoop servers,
and I could run jobs without any problems.
Today the jobs suddenly stopped running: every job stays in the “submitted” state
after submission.
There are 3 slaves, and each slave has 32 GB of memory and 24 CPUs.

The contents of my fair-scheduler.xml are as follows:

<?xml version="1.0"?>
<allocations>
    <queue name="root">
        <minResources>10000mb,10vcores</minResources>
        <maxResources>90000mb,100vcores</maxResources>
        <maxRunningApps>50</maxRunningApps>
        <weight>2.0</weight>
        <schedulingMode>fair</schedulingMode>
        <aclSubmitApps> </aclSubmitApps>
        <aclAdministerApps> </aclAdministerApps>
        <queue name="queue1">
            <minResources>10000mb,10vcores</minResources>
            <maxResources>30000mb,30vcores</maxResources>
            <maxRunningApps>10</maxRunningApps>
            <weight>2.0</weight>
            <schedulingMode>fair</schedulingMode>
            <aclAdministerApps>xxx1,xxx2 admins</aclAdministerApps>
            <aclSubmitApps>xxx1,xxx2,xxx3 datadev</aclSubmitApps>
        </queue>
        <queue name="queue2">
            <minResources>10000mb,10vcores</minResources>
            <maxResources>30000mb,30vcores</maxResources>
            <maxRunningApps>10</maxRunningApps>
            <weight>2.0</weight>
            <schedulingMode>fair</schedulingMode>
            <aclAdministerApps>datadev admins</aclAdministerApps>
            <aclSubmitApps>xxx1 datadev</aclSubmitApps>
        </queue>
        <queue name="queue3">
            <minResources>5000mb,5vcores</minResources>
            <maxResources>10000mb,10vcores</maxResources>
            <maxRunningApps>10</maxRunningApps>
            <weight>2.0</weight>
            <schedulingMode>fair</schedulingMode>
            <aclAdministerApps>datadev admins</aclAdministerApps>
            <aclSubmitApps>xxx1,xxx2 datadev</aclSubmitApps>
        </queue>
        <queue name="default">
            <minResources>10000mb,10vcores</minResources>
            <maxResources>30000mb,30vcores</maxResources>
            <maxRunningApps>10</maxRunningApps>
            <weight>2.0</weight>
            <schedulingMode>fair</schedulingMode>
            <aclAdministerApps>xxx1 admins</aclAdministerApps>
            <aclSubmitApps>xxx1,xxx2,xxx3,root datadev</aclSubmitApps>
        </queue>
    </queue>
    <user name="xxx">
        <maxRunningApps>10</maxRunningApps>
    </user>
    <userMaxAppsDefault>10</userMaxAppsDefault>
</allocations>
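
As a rough sanity check on an allocation file like the one above, the sketch below parses the "NNNmb,NNvcores" strings and compares the summed minResources of the child queues with the cluster capacity. The cluster figures (3 nodes, 32 GB and 24 CPUs each) come from this message; treating 32 GB as 32768 MB and assuming all of it is offered to YARN are assumptions, and the function names are hypothetical. It is not meant as a diagnosis of the stuck jobs.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the allocation file: only the minResources values are kept.
SAMPLE = """<allocations>
  <queue name="root">
    <minResources>10000mb,10vcores</minResources>
    <queue name="queue1"><minResources>10000mb,10vcores</minResources></queue>
    <queue name="queue2"><minResources>10000mb,10vcores</minResources></queue>
    <queue name="queue3"><minResources>5000mb,5vcores</minResources></queue>
    <queue name="default"><minResources>10000mb,10vcores</minResources></queue>
  </queue>
</allocations>"""

RESOURCE_RE = re.compile(r"^\s*(\d+)\s*mb\s*,\s*(\d+)\s*vcores\s*$", re.IGNORECASE)


def parse_resources(text):
    """Turn a string such as '10000mb,10vcores' into (megabytes, vcores)."""
    match = RESOURCE_RE.match(text)
    if match is None:
        raise ValueError("unrecognised resource string: %r" % text)
    return int(match.group(1)), int(match.group(2))


def sum_child_min_resources(parent_queue):
    """Add up minResources over the direct child queues of parent_queue."""
    total_mb = total_vcores = 0
    for child in parent_queue.findall("queue"):
        text = child.findtext("minResources")
        if text is not None:
            mb, vcores = parse_resources(text)
            total_mb += mb
            total_vcores += vcores
    return total_mb, total_vcores


root_queue = ET.fromstring(SAMPLE).find("queue")
min_mb, min_vcores = sum_child_min_resources(root_queue)

# Assumed capacity: 3 nodes, each offering 32 GB and 24 vcores to YARN.
cluster_mb, cluster_vcores = 3 * 32 * 1024, 3 * 24
fits = min_mb <= cluster_mb and min_vcores <= cluster_vcores
print("child minResources: %d MB, %d vcores; fits in cluster: %s"
      % (min_mb, min_vcores, fits))
```

To check the real file instead of the trimmed sample, ET.parse("fair-scheduler.xml").getroot() can replace ET.fromstring(SAMPLE).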

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: November 27, 2013 16:33
To: user@hadoop.apache.org
Subject: Re: problems of FairScheduler in hadoop2.2.0

Hi,

Can you share the contents of your fair-scheduler.xml?  If you submit just a single job, does
it run?  What do you see if you go to <resourcemanagerwebui>/ws/v1/cluster/scheduler?

-Sandy
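
The scheduler endpoint mentioned above can also be read from a script rather than a browser. This is a minimal sketch using only the Python standard library; the host name rm-host is a placeholder, port 8088 is the default ResourceManager web UI port and may differ on a given cluster, and the function names are hypothetical. The JSON is returned as-is, since its exact layout depends on the scheduler in use.

```python
import json
import urllib.request


def scheduler_url(rm_webapp):
    """Build the scheduler REST URL from the ResourceManager web UI address."""
    return rm_webapp.rstrip("/") + "/ws/v1/cluster/scheduler"


def fetch_scheduler_info(rm_webapp, timeout=10.0):
    """Fetch the scheduler state as parsed JSON (needs network access to the RM)."""
    request = urllib.request.Request(
        scheduler_url(rm_webapp),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def pretty(info):
    """Format the parsed response for pasting into a reply."""
    return json.dumps(info, indent=2, sort_keys=True)
```

On a machine that can reach the ResourceManager, print(pretty(fetch_scheduler_info("http://rm-host:8088"))) would dump the queue state that the question asks about.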

On Wed, Nov 27, 2013 at 12:09 AM, 麦树荣 <shurong.mai@qunar.com> wrote:
Hi, all

I have run into a problem running jobs on Hadoop 2.2.0. The ResourceManager suddenly
stopped working normally: when I submit jobs, they all stay in the “submitted” state and
never run.
I cannot find an answer on the Internet. Can anyone help? Thanks.

The resourcemanager log is as follows:

2013-11-27 14:39:10,749 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1129_000001
2013-11-27 14:39:11,050 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1128_000001
2013-11-27 14:39:11,050 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1127_000001
2013-11-27 14:39:11,051 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1128_000001
2013-11-27 14:39:11,051 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1127_000001
2013-11-27 14:39:11,753 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1129_000001
2013-11-27 14:39:11,754 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1129_000001
2013-11-27 14:39:12,055 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1128_000001
2013-11-27 14:39:12,055 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1127_000001
2013-11-27 14:39:12,056 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1128_000001
2013-11-27 14:39:12,056 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1127_000001
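
To see how many distinct application attempts are behind errors like these, a short script can tally them. This is a sketch that assumes each entry occupies one line in the ResourceManager log file and that the attempt ID follows the message text directly, as in the excerpt; the function name is hypothetical and the three sample lines are copied from the log above.

```python
import re
from collections import Counter

# The attempt ID follows the message with no separating space in the excerpt;
# \s* also tolerates a space in case another version prints one.
UNKNOWN_ATTEMPT_RE = re.compile(
    r"Request for appInfo of unknown attempt\s*(appattempt_\d+_\d+_\d+)"
)


def count_unknown_attempts(lines):
    """Count 'unknown attempt' errors per application attempt ID."""
    counts = Counter()
    for line in lines:
        match = UNKNOWN_ATTEMPT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


PREFIX = ("ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair."
          "FairScheduler: Request for appInfo of unknown attempt")
sample_lines = [
    "2013-11-27 14:39:10,749 " + PREFIX + "appattempt_1384743376038_1129_000001",
    "2013-11-27 14:39:11,050 " + PREFIX + "appattempt_1384743376038_1128_000001",
    "2013-11-27 14:39:11,753 " + PREFIX + "appattempt_1384743376038_1129_000001",
]

counts = count_unknown_attempts(sample_lines)
for attempt_id, n in counts.most_common():
    print(attempt_id, n)
```

Passing an open log file to count_unknown_attempts works the same way, since the function only iterates over lines.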
