hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13712) S3A open to avoid needless HEAD on the successful execution path
Date Wed, 02 Nov 2016 18:27:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629959#comment-15629959 ]

Steve Loughran commented on HADOOP-13712:

We need to know the content length when opening the file. And with lazy seek, no GET
request is initiated in the open() call, only on the first actual read(). That means the
requirement "open("nonexistent file") MUST raise FNFE" won't hold if the HEAD is skipped.

Closing as WONTFIX.
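
To make the reasoning concrete, here is a minimal sketch of why lazy seek and a HEAD-free open() conflict. It does not use the real S3A classes: FakeStore, LazyStream, openWithHead() and openWithoutHead() are hypothetical names invented for illustration, and the store is an in-memory map that only counts requests.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for an object store; it counts requests. */
class FakeStore {
    final Map<String, byte[]> objects = new HashMap<>();
    int heads = 0;
    int gets = 0;

    /** HEAD: returns the object length, or throws if the key is absent. */
    long head(String key) throws FileNotFoundException {
        heads++;
        byte[] data = objects.get(key);
        if (data == null) {
            throw new FileNotFoundException(key);
        }
        return data.length;
    }

    /** GET: returns the object bytes, or throws if the key is absent. */
    byte[] get(String key) throws FileNotFoundException {
        gets++;
        byte[] data = objects.get(key);
        if (data == null) {
            throw new FileNotFoundException(key);
        }
        return data;
    }
}

/** Stream with lazy seek: no GET is issued until the first read(). */
class LazyStream {
    private final FakeStore store;
    private final String key;
    final long contentLength; // -1 when open() skipped the HEAD
    private byte[] data;      // stays null until the first read()
    private int pos = 0;

    LazyStream(FakeStore store, String key, long contentLength) {
        this.store = store;
        this.key = key;
        this.contentLength = contentLength;
    }

    int read() throws IOException {
        if (data == null) {
            data = store.get(key); // the first GET happens here, not in open()
        }
        return pos < data.length ? (data[pos++] & 0xff) : -1;
    }
}

class LazyOpenDemo {
    /** open() with the HEAD probe: fails fast and knows the length. */
    static LazyStream openWithHead(FakeStore store, String key) throws IOException {
        long length = store.head(key); // throws FileNotFoundException if absent
        return new LazyStream(store, key, length);
    }

    /** open() without the probe: succeeds even for a missing object. */
    static LazyStream openWithoutHead(FakeStore store, String key) {
        return new LazyStream(store, key, -1);
    }
}
```

In this sketch, openWithoutHead() on a missing key returns normally and the FileNotFoundException only surfaces on the first read(), which is the contract violation described above. The stream also has no content length until that first GET.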

> S3A open to avoid needless HEAD on the successful execution path
> ----------------------------------------------------------------
>                 Key: HADOOP-13712
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13712
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.7.3
>            Reporter: Steve Loughran
> S3A's open() operation does a {{getFileStatus()}} check to verify that the path is not a
> directory before opening it with a GET. That initial check will take up at least one HEAD
> request if the file is present, more if it isn't.
> As the GET itself performs the existence check, the initial check is needless. A successful
> GET of a path which doesn't end in "/" means a file was there. The only reason a
> getFileStatus call is needed is to choose which error to raise if the path isn't there: is
> it an FNFE or is it path-is-directory.
> Proposed: reorder the code to do the GET; only if that fails, fall back to getFileStatus()
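
The proposed reordering can be sketched roughly as follows. This assumes an eager GET inside open(), which the comment on this issue notes is not what happens with lazy seek. CountingStore, GetFirstOpen and isDirectory() are hypothetical names for illustration, not S3A APIs, and the choice of a plain IOException for the path-is-directory case is also an assumption.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory store that counts GETs and status checks. */
class CountingStore {
    final Map<String, byte[]> files = new HashMap<>();
    final Set<String> dirs = new HashSet<>();
    int gets = 0;
    int statusChecks = 0;

    /** GET: returns the bytes, or throws if no file exists at the path. */
    byte[] get(String path) throws FileNotFoundException {
        gets++;
        byte[] data = files.get(path);
        if (data == null) {
            throw new FileNotFoundException(path);
        }
        return data;
    }

    /** Stand-in for the getFileStatus() probe; used only on the failure path. */
    boolean isDirectory(String path) {
        statusChecks++;
        return dirs.contains(path);
    }
}

class GetFirstOpen {
    /** Do the GET first; probe the path only to pick the right error. */
    static byte[] open(CountingStore store, String path) throws IOException {
        try {
            return store.get(path); // success path: one request, no probe
        } catch (FileNotFoundException e) {
            if (store.isDirectory(path)) {
                throw new IOException("Path is a directory: " + path);
            }
            throw e; // genuinely missing: surface the FNFE
        }
    }
}
```

On the success path this sketch issues a single GET and no status check; the extra probe is paid only when the GET fails.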
