hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3881) IPC client doesnt time out if far end handler hangs
Date Fri, 01 Aug 2008 15:38:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619046#action_12619046 ]

Doug Cutting commented on HADOOP-3881:
--------------------------------------

> a bit of jitter is needed [ ... ]

There is jitter in block reports and in ExponentialBackoffRetry.  I have not heard of folks
having problems on cluster restart.
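
For readers unfamiliar with the idea, here is a minimal sketch of exponential backoff with jitter. It is not Hadoop's ExponentialBackoffRetry code: the class name JitteredBackoff, its methods, and the choice of drawing the delay uniformly below an exponentially growing cap are all assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical helper for illustration only; not taken from the Hadoop source.
final class JitteredBackoff {
    private final long baseMillis;
    private final long maxMillis;
    private final Random random;

    JitteredBackoff(long baseMillis, long maxMillis, Random random) {
        if (baseMillis <= 0 || maxMillis <= 0) {
            throw new IllegalArgumentException("delays must be positive");
        }
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
        this.random = random;
    }

    // Upper bound for a zero-based attempt: base * 2^attempt, capped at maxMillis.
    long ceilingMillis(int attempt) {
        long ceiling = baseMillis;
        for (int i = 0; i < attempt; i++) {
            // Jump straight to the cap instead of doubling past it (avoids overflow).
            ceiling = ceiling > maxMillis / 2 ? maxMillis : ceiling * 2;
        }
        return Math.min(ceiling, maxMillis);
    }

    // Jittered delay: uniform in [0, ceiling), so clients that failed together
    // do not all retry at the same instant.
    long nextDelayMillis(int attempt) {
        return (long) (random.nextDouble() * ceilingMillis(attempt));
    }
}
```

A caller would sleep for nextDelayMillis(attempt) between retries. The randomness is what spreads out reconnects after something like a cluster restart; the exponential cap only bounds how long any one client waits.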

> Also, maybe the IPC and design decisions could be documented in the wiki

The problem with detailed code documentation separate from the code is that it quickly goes
stale.  The internal design is dynamic.  What's better for this is good documentation in the
code, since that is more naturally maintained as the code changes.  It's best to only use
separate documentation for slower-moving targets like end-user API documentation and high-level
architectural documentation.


> IPC client doesnt time out if far end handler hangs
> ---------------------------------------------------
>
>                 Key: HADOOP-3881
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3881
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: ipc
>            Reporter: Steve Loughran
>            Priority: Minor
>
> This is what appears to be happening in some changes of mine that (inadvertently) blocked
> JobTracker: if the client can connect to the far end and invoke an operation, the far end
> has forever to deal with the request: the client blocks too.
> Clearly the far end shouldn't do this; it's a serious problem to address. But should the
> client hang? Should it not time out after some specifiable time and signal that the far end
> isn't processing requests in a timely manner?
> (Marked as minor as this shouldn't arise in day-to-day operation. But it should be easy
> to create a mock object to simulate this, and timeouts are considered useful in an IPC.)
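
The behaviour requested in the issue, a client that gives up after a configurable time when the far end never answers, can be sketched with a mock handler along the lines the reporter suggests. This is not the Hadoop IPC client: TimedCall, callWithTimeout, and hangingHandler are hypothetical names, and the sketch uses only java.util.concurrent.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; names and structure are assumptions, not Hadoop's IPC code.
final class TimedCall {
    // Runs the call on the executor and waits at most timeoutMillis for its result.
    // Throws TimeoutException if the far end has not answered in time.
    static <T> T callWithTimeout(ExecutorService executor, Callable<T> call, long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the worker so it does not stay blocked forever.
            // (Blocking java.io socket reads ignore interrupts; a real client
            // would also need to close the connection.)
            future.cancel(true);
            throw e;
        }
    }

    // Mock far end whose handler hangs: it blocks until its thread is interrupted.
    static Callable<String> hangingHandler() {
        return () -> {
            new CountDownLatch(1).await();
            return "unreachable";
        };
    }
}
```

Calling TimedCall.callWithTimeout(pool, TimedCall.hangingHandler(), 100) throws TimeoutException after roughly 100 ms instead of blocking indefinitely, which gives the caller a chance to report that the far end is not processing requests in a timely manner.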

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

