hadoop-hdfs-dev mailing list archives

From "Bob Hansen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-10781) libhdfs++: redefine NN timeout to be "time without a response"
Date Fri, 19 Aug 2016 20:29:20 GMT
Bob Hansen created HDFS-10781:

             Summary: libhdfs++: redefine NN timeout to be "time without a response"
                 Key: HDFS-10781
                 URL: https://issues.apache.org/jira/browse/HDFS-10781
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Bob Hansen

In the find tool, we submit a zillion requests to the NameNode asynchronously.  As the queue
on the NameNode grows, the response time for each individual message increases.  In
the find tool, we were eventually getting timeouts on requests, even though the NN was responding
as fast as its little feet could carry it.

I propose that we redefine timeouts to be on a per-connection basis rather than per-request.
If a client has an outstanding request to the NN but hasn't gotten any response back on that
connection within n msec, it should declare the connection dead and retry.  As long as the
NameNode is being responsive to the best of its ability and providing data, we will not declare
the link dead.
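
To make the proposed rule concrete, here is a minimal sketch of "time without a response"
tracked per connection. ConnectionWatchdog and its method names are hypothetical and are not
part of libhdfs++; time points are passed in explicitly so the logic is easy to follow, and a
real client would presumably drive this from its async I/O timers instead.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not a libhdfs++ class): decides whether a connection
// is dead based on how long it has been silent, not on the age of any one
// request.
class ConnectionWatchdog {
public:
  using Clock = std::chrono::steady_clock;
  using TimePoint = Clock::time_point;
  using Duration = std::chrono::milliseconds;

  ConnectionWatchdog(Duration timeout, TimePoint now)
      : timeout_(timeout), last_activity_(now) {}

  // Record a request being sent. If nothing was outstanding, restart the
  // silence clock so idle time is not counted against the NameNode.
  void OnRequestSent(TimePoint now) {
    if (outstanding_ == 0) last_activity_ = now;
    ++outstanding_;
  }

  // Any response, for any request, proves the NameNode is still answering.
  void OnResponseReceived(TimePoint now) {
    if (outstanding_ > 0) --outstanding_;
    last_activity_ = now;
  }

  // Dead only if at least one request is outstanding and the connection has
  // been silent for longer than the timeout.
  bool IsDead(TimePoint now) const {
    return outstanding_ > 0 && (now - last_activity_) > timeout_;
  }

  std::size_t outstanding() const { return outstanding_; }

private:
  Duration timeout_;
  TimePoint last_activity_;
  std::size_t outstanding_ = 0;
};
```

With this rule, a request that has been queued for much longer than the timeout does not trip
the watchdog as long as responses to other requests keep arriving on the same connection.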

One potential violation of the Principle of Least Astonishment here is that any particular request
from a client cannot be depended on to get a positive or negative response within a fixed
amount of time, but I think that may be a good trade to make.
