hadoop-common-dev mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4939) Create a test that would inject random failures for tasks in large jobs and would also inject TaskTracker failures
Date Wed, 31 Dec 2008 14:20:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660082#action_12660082
] 

Vinod K V commented on HADOOP-4939:
-----------------------------------

At a minimum, we should do our best to ensure that we are sending signals to the right
process. For this, we might want to grep the process list for "java" AND the full class name,
instead of just searching for the daemons' names. We are limiting this to one user anyway, and
furthermore we are only stopping and continuing the processes, not actually destroying them.
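
A minimal sketch of the stricter match suggested above, assuming the process list has
already been read as lines of the form "<pid> <full command line>" (for example from
something like `ps -u <user> -o pid,args`). The class DaemonPidFinder and the method
findPids are hypothetical names used only for illustration; they are not part of the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the patch: picks daemon PIDs out of process-list lines.
public class DaemonPidFinder {

    /**
     * Returns the PIDs of processes whose executable is java AND whose
     * arguments contain the given fully qualified class name as a whole token.
     * Each input line is assumed to look like "<pid> <command> <args...>".
     */
    static List<Integer> findPids(List<String> psLines, String fullClassName) {
        List<Integer> pids = new ArrayList<>();
        for (String line : psLines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 3) {
                continue; // header line, or a command without arguments
            }
            // Condition 1: the executable itself is java (bare name or full path).
            boolean isJava = parts[1].equals("java") || parts[1].endsWith("/java");
            // Condition 2: the full class name appears as an exact argument.
            boolean hasClass = false;
            for (int i = 2; i < parts.length; i++) {
                if (parts[i].equals(fullClassName)) {
                    hasClass = true;
                    break;
                }
            }
            if (isJava && hasClass) {
                try {
                    pids.add(Integer.parseInt(parts[0]));
                } catch (NumberFormatException e) {
                    // First column was not a PID; ignore this line.
                }
            }
        }
        return pids;
    }
}
```

Matching on both conditions skips unrelated processes that merely mention the daemon's
name, such as an editor or a grep whose arguments contain "TaskTracker".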

I have gone through the patch; a few code comments:
 - runSleepJobTest, runRandomWriterTest and runSortTest can be refactored, as they share much
code.
 - If we can somehow get TASKTRACKER_EXPIRY_INTERVAL from the JT, either via a public API or
maybe via clusterStatus, it would make the tests better than just relying on
user input.
 - In KillTrackerThread.{startTaskTrackers|stopTaskTrackers}, the output of the shellCommand
is currently discarded. That output, along with the return code, would give more information
about whether the signal was sent successfully.
 - If the configuration is not set up (mapred-site.xml), the local job runner is used and the
test fails with little error reporting. I think we should at least check for the JT configuration.
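
To illustrate the point about the discarded shellCommand output, here is a rough sketch of
keeping both the output and the return code of an external command. It uses only the JDK's
ProcessBuilder, which is an assumption made for this example and not necessarily how the
patch runs its commands; SignalCommandRunner, Result and run are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not from the patch: runs a command and keeps what it reports.
public class SignalCommandRunner {

    /** Exit code plus combined stdout/stderr of a finished command. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Runs the command, waits for it to finish, and returns its output and exit code. */
    static Result run(String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so nothing is lost
        try {
            Process process = builder.start();
            StringBuilder output = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            return new Result(process.waitFor(), output.toString());
        } catch (IOException e) {
            throw new UncheckedIOException("could not run command", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }
}
```

A caller could log result.output and treat a non-zero result.exitCode as a failed signal,
for instance when the target PID no longer exists.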

> Create a test that would inject random failures for tasks in large jobs and would also
inject TaskTracker failures
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4939
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4939
>             Project: Hadoop Core
>          Issue Type: Sub-task
>          Components: mapred, test
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.20.0
>
>         Attachments: 4939.patch
>
>
> Create a test that would inject random failures for tasks in large jobs and would also
inject TaskTracker failures

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

