hadoop-common-dev mailing list archives

From "Karam Singh (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5794) Sometimes job does not get removed from scheduler queue after it is killed
Date Fri, 08 May 2009 13:19:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707336#action_12707336 ]

Karam Singh commented on HADOOP-5794:
-------------------------------------

Cluster setup:
Cluster Capacity = 204 maps, 204 reduces
4 queues 
Q1 Capacity Percent= 40
Q2 Capacity Percent= 40
Q3 Capacity Percent= 40
Q4 Capacity Percent= 40

Each queue has user limit=100%
Submitted 8 jobs to each queue, for a total of 32 sleep jobs, each with maps=10000
(sleep time 5 secs) and reduces=2 (sleep time 1 min).
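
To give a sense of the scale of this workload, the arithmetic can be sketched as below. This is only a rough illustration built from the numbers in this comment; the constant names are made up for the example, and the time estimate assumes a single job holds every map slot and ignores all scheduling overhead.

```python
import math

# Figures taken from the cluster setup described in this comment.
QUEUES = 4
JOBS_PER_QUEUE = 8
MAPS_PER_JOB = 10_000
REDUCES_PER_JOB = 2
MAP_SLEEP_SECS = 5
MAP_SLOTS = 204  # cluster-wide map capacity

total_jobs = QUEUES * JOBS_PER_QUEUE
total_maps = total_jobs * MAPS_PER_JOB
total_reduces = total_jobs * REDUCES_PER_JOB

# Idealized lower bound for one job's map phase: the maps run in "waves"
# of at most MAP_SLOTS tasks, and each wave lasts one sleep interval.
waves = math.ceil(MAPS_PER_JOB / MAP_SLOTS)
min_map_phase_secs = waves * MAP_SLEEP_SECS

print(total_jobs, total_maps, total_reduces, min_map_phase_secs)
```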
All jobs were initialized, out of which maps of 4 jobs started running. When at least 1000 maps
of each job had completed, restarted the JobTracker.
After the JobTracker recovered, waited until 4 jobs had completed, then killed all
28 remaining jobs.
All jobs got killed successfully.
The JobTracker web UI displayed all killed jobs under the failed jobs list. hadoop job -list all also
displays the status of the 28 killed jobs as 5.
While browsing through the jobqueue_details.jsp pages of the queues, found that 2 of the killed jobs
had not been removed from the capacity scheduler queue. Maps of both jobs were running before
the kill was sent to them.
To check whether the cluster would be blocked because of this, submitted 3 more jobs to each queue
where the 2 killed jobs were listed, and verified that the newly submitted jobs ran successfully.
Waited up to 20 mins before shutting down the cluster.
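
The manual check described above (compare each job's state with what the queue pages still list) could be automated along the following lines. This is a minimal sketch, not part of Hadoop: the function name, the job ids and the queue contents are hypothetical, and the state codes are assumed to follow the integer constants of org.apache.hadoop.mapred.JobStatus in 0.20, where 5 means KILLED.

```python
# State codes assumed from org.apache.hadoop.mapred.JobStatus (Hadoop 0.20).
JOB_STATES = {1: "RUNNING", 2: "SUCCEEDED", 3: "FAILED", 4: "PREP", 5: "KILLED"}
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "KILLED"}


def find_stale_queue_entries(job_states, queue_listings):
    """Return {queue: [job ids]} for finished jobs that a queue still lists.

    job_states:     job id -> integer state code (as shown by the job listing)
    queue_listings: queue name -> job ids still shown on the queue page
    """
    stale = {}
    for queue, job_ids in queue_listings.items():
        leftovers = [
            job_id for job_id in job_ids
            if JOB_STATES.get(job_states.get(job_id)) in TERMINAL_STATES
        ]
        if leftovers:
            stale[queue] = leftovers
    return stale


# Hypothetical sample data; real values would be collected from the cluster.
job_states = {"job_0001": 5, "job_0002": 1, "job_0003": 5, "job_0004": 2}
queue_listings = {"Q1": ["job_0001", "job_0002"], "Q2": ["job_0003"], "Q3": []}

stale = find_stale_queue_entries(job_states, queue_listings)
print(stale)
```

In this sample, the two killed jobs that are still listed under Q1 and Q2 are reported, which mirrors the symptom seen on the jobqueue_details.jsp pages.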


> Sometimes job does not get removed from scheduler queue after it is killed
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-5794
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5794
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.20.0
>            Reporter: Karam Singh
>
> Sometimes when we kill a job, it does not get removed from the waiting queue, even though the
> job status is "Killed" with Job Setup and Cleanup: "Successful".
> Also, the JobTracker web UI shows the job under the failed jobs list, and hadoop job -list all and
> hadoop queue <queuename> -showJobs also show the job's state=5.
> Prior to the kill, the job state was "Running".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

