hadoop-hdfs-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7163) WebHdfsFileSystem should retry reads according to the configured retry policy.
Date Fri, 20 Nov 2015 23:24:10 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15019039#comment-15019039
] 

Eric Payne commented on HDFS-7163:
----------------------------------

[~wheat9], thank you for your review and comments on this feature.

bq. I think retrying only on the data node is problematic as the retry might have little value
when the DN goes down.

In this patch, if the DN that is being read from goes down, WebHDFS will put that DN into
the client's URL exclude list before querying the NN again for another DN. The only time the
same DN is reused is if a seek has occurred.
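
As a rough illustration of the exclude-list behavior described above, here is a minimal, self-contained sketch. It is not the code from the patch and uses no Hadoop APIs: the class name, `pickDatanode` (a stand-in for the NN choosing a DN), the `downNodes` set, and the `maxAttempts` budget are all hypothetical names made up for this example. It also does not model the seek case, where the same DN may be reused.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class ExcludeListRetrySketch {

    /** Hypothetical stand-in for the NN: returns the first replica not in the exclude list. */
    static Optional<String> pickDatanode(List<String> replicas, Set<String> excluded) {
        for (String dn : replicas) {
            if (!excluded.contains(dn)) {
                return Optional.of(dn);
            }
        }
        return Optional.empty();
    }

    /**
     * Simulates a read that retries on a different DN after a failure.
     * Returns the name of the DN that served the read, or throws once the
     * retry budget or the replica list is exhausted.
     */
    static String readWithRetry(List<String> replicas, Set<String> downNodes, int maxAttempts) {
        Set<String> excluded = new LinkedHashSet<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Optional<String> dn = pickDatanode(replicas, excluded);
            if (!dn.isPresent()) {
                break; // every replica has already been excluded
            }
            if (!downNodes.contains(dn.get())) {
                return dn.get(); // read succeeded on this DN
            }
            // The DN failed mid-read: exclude it before asking for another one.
            excluded.add(dn.get());
        }
        throw new IllegalStateException("read failed: no usable DN within the retry budget");
    }

    public static void main(String[] args) {
        List<String> replicas = Arrays.asList("dn1", "dn2", "dn3");
        // dn1 is down, so the retry lands on dn2.
        System.out.println(readWithRetry(replicas, Collections.singleton("dn1"), 3));
    }
}
```

The point of the sketch is only that a failed DN is never asked again within the same read, so each retry goes to a different replica rather than back to the node that just went down.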

bq. An alternative approach is to have WebHDFS (1) expose a GET_BLOCK call where the DN returns
the block directly, and (2) be a smarter client that retries based on block locations.

Although this may be a more elegant solution, I think that could be done as part of a separate
JIRA, given that we can take advantage of the exclude list functionality as I mentioned above.


> WebHdfsFileSystem should retry reads according to the configured retry policy.
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-7163
>                 URL: https://issues.apache.org/jira/browse/HDFS-7163
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 3.0.0, 2.5.1
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: HDFS-7163-branch-2.003.patch, HDFS-7163-branch-2.7.003.patch, HDFS-7163.001.patch,
> HDFS-7163.002.patch, HDFS-7163.003.patch, WebHDFS Read Retry.pdf
>
>
> In the current implementation of WebHdfsFileSystem, opens are retried according to the
> configured retry policy, but not reads. Therefore, if a connection goes down while data is
> being read, the read will fail and the read will have to be retried by the client code.
> Also, after a connection has been established, the next read (or seek/read) will fail
> and the read will have to be restarted by the client code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
