hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4209) The TaskAttemptID should not have the JobTracker start time
Date Fri, 19 Sep 2008 05:28:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632519#action_12632519 ]

Owen O'Malley commented on HADOOP-4209:

After looking through the code some more and seeing the attempt ids like:


There are problems:
  1. The format of the task ids changes depending on the context.
  2. The final number is way longer than it needs to be.
  3. The numbers are out of order for sorting.
  4. The change of the format of the task ids needs to be called out much more explicitly.

I think it would be much better to expand the retry out to 4 digits and increment the leading
digit by one each time the job runs through a restart:

attempt_200707121733_0003_m_000005_0001   // fails once
attempt_200707121733_0003_m_000005_1000   // after a restart
attempt_200707121733_0003_m_000005_1001   // fails after restart
attempt_200707121733_0003_m_000005_2000   // after a second restart

That way, we keep the format consistent and compatible. It is a single variable to track in
the JobInProgress and is easy to explain. The only problem would come if you had 1000 failures
on an attempt and then had a JT reset.
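
A minimal sketch of this numbering, assuming 1000 attempt slots per restart as the examples above suggest. The class and method names are hypothetical and not part of Hadoop; the restart count stands in for the single variable tracked in the JobInProgress.

```java
import java.util.Locale;

// Hypothetical helper, not Hadoop code: illustrates the proposed numbering only.
final class AttemptSuffixSketch {
    // Assumption: each restart reserves a block of 1000 attempt numbers.
    static final int ATTEMPTS_PER_RESTART = 1000;

    /** Combines the restart count and the attempt number within that restart. */
    static int encode(int restartCount, int attemptInRestart) {
        if (restartCount < 0) {
            throw new IllegalArgumentException("restartCount must be >= 0");
        }
        // This guard is the overflow case noted above: 1000 failures on one
        // attempt would spill into the next restart's block of numbers.
        if (attemptInRestart < 0 || attemptInRestart >= ATTEMPTS_PER_RESTART) {
            throw new IllegalArgumentException("attemptInRestart out of range");
        }
        return restartCount * ATTEMPTS_PER_RESTART + attemptInRestart;
    }

    /** Appends the zero-padded, 4-digit suffix to an attempt id prefix. */
    static String format(String prefix, int restartCount, int attemptInRestart) {
        return String.format(Locale.ROOT, "%s_%04d",
                prefix, encode(restartCount, attemptInRestart));
    }
}
```

With the prefix attempt_200707121733_0003_m_000005, the pairs (0, 1), (1, 0), (1, 1) and (2, 0) give the four ids listed above. The suffix grows past four digits after nine restarts, which this sketch does not try to handle.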

> The TaskAttemptID should not have the JobTracker start time
> -----------------------------------------------------------
>                 Key: HADOOP-4209
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4209
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Owen O'Malley
>            Priority: Blocker
>             Fix For: 0.19.0
> The TaskAttemptID now includes a redundant copy of the JobTracker's start time as milliseconds.
> We should instead change the JobID to have the longer unique string.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
