hadoop-common-dev mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-5068) testClusterBlockingForLackOfMemory in TestCapacityScheduler fails randomly
Date Tue, 20 Jan 2009 09:37:59 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5068:
------------------------------

    Attachment: HADOOP-5068-20090120-git.txt

I could reproduce the problem; it surfaces very often in consecutive runs. The reason for
the (random) failure:

In FakeTaskTrackerManager.killJob(),
{code}
    public void killJob(JobID jobid) throws IOException {
      JobInProgress job = jobs.get(jobid);
      finalizeJob(job, JobStatus.KILLED);
      job.kill();
    }
{code}
If the job state goes back to RUNNING after the finalizeJob() call, job.kill() throws the
exception posted above. This can happen when JobInitializationPoller calls FakeJobInProgress.initTasks()
after finalizeJob() returns but before job.kill() starts.
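
To make the window concrete, here is a minimal single-threaded sketch of the interleaving described above. It is not the Hadoop test code: FakeJob, State and the betweenSteps hook are hypothetical stand-ins, and the exception type thrown by kill() is an assumption. The hook simply plays the part of whatever another thread (here, the initialization poller) does between the two calls.

```java
// Illustrative sketch only: hypothetical stand-ins, not the Hadoop test classes.
class KillRaceSketch {
    enum State { PREP, RUNNING, KILLED }

    static class FakeJob {
        private State state = State.PREP;

        synchronized State getState() { return state; }

        // Stand-in for FakeJobInProgress.initTasks(): initialization marks the job RUNNING.
        synchronized void initTasks() { state = State.RUNNING; }

        // Stand-in for finalizeJob(job, JobStatus.KILLED).
        synchronized void finalizeAs(State finalState) { state = finalState; }

        // Assumption: kill() fails when it finds the job RUNNING again.
        synchronized void kill() {
            if (state == State.RUNNING) {
                throw new IllegalStateException("job went back to RUNNING before kill()");
            }
        }
    }

    // Mirrors the shape of killJob(); betweenSteps stands for whatever another
    // thread manages to do between the finalize step and job.kill().
    static boolean killJob(FakeJob job, Runnable betweenSteps) {
        job.finalizeAs(State.KILLED);
        betweenSteps.run();
        try {
            job.kill();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        FakeJob quiet = new FakeJob();
        System.out.println("no interleaving, kill succeeded: " + killJob(quiet, () -> { }));

        FakeJob raced = new FakeJob();
        System.out.println("poller interleaves, kill succeeded: " + killJob(raced, raced::initTasks));
    }
}
```

In the real test the interleaving depends on thread scheduling, which is why the failure only shows up in some runs.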

The failure most likely started appearing once the fix that made initTasks asynchronous
(via JobInitializationPoller) went in.

Attaching a patch. It removes job.kill() from killJob(), as the call is not actually needed, and
uses ControlledJobInitialization so that the initialization poller doesn't get in our way. I have
run the test many times now and no longer see any failures.
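
As a rough illustration of the two changes described above (not the contents of the attached patch), the sketch below drops the kill() call and lets the test decide when initialization happens. ControlledInitializer, jobAdded() and initializePending() are hypothetical names chosen for this example; they are not claimed to match the helper used in TestCapacityScheduler.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch only: hypothetical names, not the classes touched by the patch.
class ControlledInitSketch {
    enum State { PREP, RUNNING, KILLED }

    static class FakeJob {
        State state = State.PREP;

        // Stand-in for FakeJobInProgress.initTasks().
        void initTasks() { state = State.RUNNING; }
    }

    // Test-controlled initializer: jobs are queued when added and are only
    // initialized when the test explicitly asks, never from a background thread.
    static class ControlledInitializer {
        private final Queue<FakeJob> pending = new ArrayDeque<>();

        void jobAdded(FakeJob job) { pending.add(job); }

        // Initializes every queued job and returns how many were initialized.
        int initializePending() {
            int initialized = 0;
            FakeJob job;
            while ((job = pending.poll()) != null) {
                job.initTasks();
                initialized++;
            }
            return initialized;
        }
    }

    // killJob() reduced to the finalize step, with no separate kill() call.
    static void killJob(FakeJob job) {
        job.state = State.KILLED;
    }

    public static void main(String[] args) {
        ControlledInitializer initializer = new ControlledInitializer();
        FakeJob job = new FakeJob();
        initializer.jobAdded(job);

        System.out.println("initialized: " + initializer.initializePending());
        killJob(job);
        System.out.println("state after killJob: " + job.state);
        System.out.println("initialized later: " + initializer.initializePending());
    }
}
```

Because initialization runs only at points the test chooses, nothing can flip the job back to RUNNING behind the test's back once it has been killed.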

> testClusterBlockingForLackOfMemory in TestCapacityScheduler fails randomly
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-5068
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5068
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Sreekanth Ramakrishnan
>            Assignee: Vinod K V
>         Attachments: HADOOP-5068-20090120-git.txt
>
>
> testClusterBlockingForLackOfMemory fails randomly when TestCapacityScheduler is run.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

