hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1880) When Namenode network is unplugged, DFSClient operations waits for ever
Date Fri, 01 Jul 2011 12:38:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058525#comment-13058525 ]

Uma Maheswara Rao G commented on HDFS-1880:
-------------------------------------------

Since we are moving towards HA implementations, this issue will cause many problems.
We have observed the same behavior in our HA clusters.
 
Here the actual problem is at:
{code}
public int read(byte[] buf, int off, int len) throws IOException {
  do {
    try {
      return super.read(buf, off, len);
    } catch (SocketTimeoutException e) {
      handleTimeout(e);
    }
  } while (true);
}
{code}

When we unplug the network cable, super.read throws SocketTimeoutException.
handleTimeout handles the exception, and the client then tries to send the ping request again.

So this loop repeats indefinitely.

bq. So I feel we can add some configuration to retry for that specific interval, as per the need.

Yes, we may need to limit these retries so that the loop breaks after some number
of attempts, because continuously getting timeout exceptions can itself be considered a sign
of a problem in the cluster environment.
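
A minimal sketch of what such a bounded retry could look like. It is not the actual patch: the class name BoundedRetryInputStream, the maxTimeouts parameter (assumed to come from a new client configuration key), and the no-op handleTimeout are hypothetical and only illustrate the idea of rethrowing after a fixed number of consecutive timeouts.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;

/** Hypothetical sketch: gives up after a fixed number of consecutive read timeouts. */
class BoundedRetryInputStream extends FilterInputStream {
    // Assumption: this limit would be read from a new client configuration key.
    private final int maxTimeouts;

    BoundedRetryInputStream(InputStream in, int maxTimeouts) {
        super(in);
        this.maxTimeouts = maxTimeouts;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int timeouts = 0; // consecutive timeouts seen during this read call
        while (true) {
            try {
                return super.read(buf, off, len);
            } catch (SocketTimeoutException e) {
                timeouts++;
                if (timeouts >= maxTimeouts) {
                    // Stop retrying and surface the failure to the caller.
                    throw e;
                }
                handleTimeout(e);
            }
        }
    }

    /** Placeholder for the existing timeout handler; a no-op in this sketch. */
    protected void handleTimeout(SocketTimeoutException e) throws IOException {
        // Intentionally empty: the real handler's behavior is not reproduced here.
    }
}
```

With a limit of 3, a read against an unreachable peer would fail with SocketTimeoutException after the third timeout instead of looping forever. Whether the limit should be a retry count or a total time interval is a design choice left open here.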


> When Namenode network is unplugged, DFSClient operations waits for ever
> -----------------------------------------------------------------------
>
>                 Key: HDFS-1880
>                 URL: https://issues.apache.org/jira/browse/HDFS-1880
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>            Reporter: Uma Maheswara Rao G
>
> When NN/DN is shutdown gracefully, the DFSClient operations which are waiting for a response from NN/DN, will throw exception & come out quickly
> But when the NN/DN network is unplugged, the DFSClient operations which are waiting for a response from NN/DN, waits for ever.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
