hadoop-common-issues mailing list archives

From "Peter Shi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13404) RPC call hangs when server side CPU overloaded
Date Fri, 22 Jul 2016 01:49:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388747#comment-15388747
] 

Peter Shi commented on HADOOP-13404:
------------------------------------

I think there are two possible solutions:

1) Add a ping response to the RPC server and check for it on the client side. This needs both
client-side and server-side changes, which may cause compatibility issues.
2) Add a thread that scans the calls inside the connection and sends a timeout exception to any
call that has not received a response for a long time. This is a client-side-only solution.
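
To make option 2 more concrete, here is a rough sketch of what such a scanner could look like. It is not based on the actual org.apache.hadoop.ipc.Client code: the class name CallTimeoutScanner, its methods, and the byte[] response type are all hypothetical, and the timeout and scan interval are just assumed constructor parameters.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch only; names and types are assumptions, not Hadoop's IPC client. */
final class CallTimeoutScanner implements AutoCloseable {

    /** One outstanding call: when it was sent and the future its caller waits on. */
    private static final class PendingCall {
        final long sentAtNanos = System.nanoTime();
        final CompletableFuture<byte[]> response = new CompletableFuture<>();
    }

    private final Map<Integer, PendingCall> calls = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scanner;
    private final long timeoutNanos;

    CallTimeoutScanner(long timeoutMillis, long scanIntervalMillis) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        // Daemon thread, so the scanner never keeps the JVM alive on its own.
        this.scanner = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "call-timeout-scanner");
            t.setDaemon(true);
            return t;
        });
        scanner.scheduleWithFixedDelay(this::expireOverdueCalls,
                scanIntervalMillis, scanIntervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Called when a request is written to the connection. */
    public CompletableFuture<byte[]> register(int callId) {
        PendingCall call = new PendingCall();
        calls.put(callId, call);
        return call.response;
    }

    /** Called by the connection's reader when a response arrives. */
    public void complete(int callId, byte[] payload) {
        PendingCall call = calls.remove(callId);
        if (call != null) {
            call.response.complete(payload);
        }
    }

    /** Fails every call that has waited longer than the timeout. */
    void expireOverdueCalls() {
        long now = System.nanoTime();
        for (Map.Entry<Integer, PendingCall> e : calls.entrySet()) {
            PendingCall call = e.getValue();
            // remove(key, value) ensures a call is failed at most once, even if
            // a response races with the scanner.
            if (now - call.sentAtNanos >= timeoutNanos && calls.remove(e.getKey(), call)) {
                call.response.completeExceptionally(
                        new TimeoutException("call " + e.getKey() + " got no response in time"));
            }
        }
    }

    public int pendingCount() {
        return calls.size();
    }

    @Override
    public void close() {
        scanner.shutdownNow();
    }
}
```

With something like this, a caller blocked on the returned future would get an ExecutionException wrapping a TimeoutException instead of hanging, and could then retry or fail over. The server needs no change, which is why this option avoids the compatibility concern of option 1.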

> RPC call hangs when server side CPU overloaded
> ----------------------------------------------
>
>                 Key: HADOOP-13404
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13404
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Peter Shi
>
> In our reliability test, we inject a fault on the namenode that consumes 100% of the CPU. After
the fault injection, all requests on existing connections hang forever and never time out. New
connections fail over to another namenode in an HA deployment.
> There is no timeout mechanism for calls on an established connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
