hadoop-common-issues mailing list archives

From "Li Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11398) RetryUpToMaximumTimeWithFixedSleep needs to behave more accurately
Date Fri, 10 Jul 2015 17:15:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622604#comment-14622604 ]

Li Lu commented on HADOOP-11398:
--------------------------------

Hi [~ajisakaa], there is one issue with your current fix: timeLimit should be set when the
first retry happens, not when the object is created. Say a retry object is created at t1
with maximum time delta_t, and the first actual retry happens at t2. We want retries to
stop at t2+delta_t, not t1+delta_t. There may be a significant gap between t1 and t2, so
setting timeLimit to t1+delta_t may cut the retry window short. I'm not sure that all of
our use cases let us safely assume or enforce t1=t2.
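
To make the t1/t2 distinction concrete, here is a minimal sketch of a policy that fixes
its deadline on the first retry decision rather than in the constructor. The class name
LazyDeadlineRetry, the shouldRetry() method, and the injected clock are all hypothetical
and are not the Hadoop RetryPolicy API; this only illustrates the idea.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch only; not the actual Hadoop RetryPolicy interface. */
class LazyDeadlineRetry {
    private final long maxTimeMs;     // delta_t
    private final LongSupplier clock; // assumed clock returning milliseconds
    private boolean started = false;  // becomes true at the first retry decision (t2)
    private long deadlineMs;          // t2 + delta_t once started

    LazyDeadlineRetry(long maxTimeMs, LongSupplier clock) {
        // Deliberately does not read the clock here: creation time (t1) is ignored.
        this.maxTimeMs = maxTimeMs;
        this.clock = clock;
    }

    /** Returns true while the current time is before t2 + delta_t. */
    boolean shouldRetry() {
        long now = clock.getAsLong();
        if (!started) {
            started = true;
            deadlineMs = now + maxTimeMs; // anchor the limit at the first retry
        }
        return now < deadlineMs;
    }
}
```

With this shape, a long gap between object creation and the first retry does not consume
any of the retry budget.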

> RetryUpToMaximumTimeWithFixedSleep needs to behave more accurately
> ------------------------------------------------------------------
>
>                 Key: HADOOP-11398
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11398
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: HADOOP-11398-121114.patch, HADOOP-11398.002.patch
>
>
> RetryUpToMaximumTimeWithFixedSleep now inherits RetryUpToMaximumCountWithFixedSleep and
> just acts as a wrapper to decide maxRetries. The current implementation uses (maxTime / sleepTime)
> as the number of maxRetries. This is fine if the actual time taken by each retry is significantly
> less than the sleep time, but it becomes less accurate if each retry takes an amount of time
> comparable to the sleep time. The problem gets worse when there are underlying retries.
> We may want to use timers inside RetryUpToMaximumTimeWithFixedSleep to perform accurate
> timing.
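
The inaccuracy described in the quoted issue can be worked through with a little
arithmetic. The sketch below assumes a simplified model in which every retry costs one
fixed sleep plus one fixed attempt duration; the class and method names are hypothetical
and are not taken from Hadoop.

```java
/** Hypothetical arithmetic sketch; the cost model is an assumption. */
class RetryCountEstimate {
    /** Retry count as derived in the issue description: maxTime / sleepTime. */
    static long maxRetries(long maxTimeMs, long sleepTimeMs) {
        return maxTimeMs / sleepTimeMs;
    }

    /** Wall-clock time spent if each retry costs sleepTimeMs + attemptMs. */
    static long elapsedMs(long maxTimeMs, long sleepTimeMs, long attemptMs) {
        return maxRetries(maxTimeMs, sleepTimeMs) * (sleepTimeMs + attemptMs);
    }
}
```

Under that model, maxTime = 10000 ms with sleepTime = 1000 ms gives 10 retries. If each
attempt takes 10 ms the total is 10100 ms, close to the limit; if each attempt takes
1000 ms the total is 20000 ms, twice the intended maximum.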



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
