hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13221) s3a create() doesn't check for a parent path being a file
Date Mon, 25 Jul 2016 16:40:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392252#comment-15392252 ]

Steve Loughran commented on HADOOP-13221:

This has the potential to be very expensive. It could perhaps be done asynchronously, with
the create() call starting a check whose failure surfaces in a subsequent write() operation.
Even then, given that s3a doesn't write files until close(), there's a race condition: a
create() check may pass, but if a file is later created further up the directory tree, the
file would still be created in the final close().
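
A rough sketch of the asynchronous idea, not the s3a implementation: the class name
DeferredCheckStream and the Runnable "parentCheck" are hypothetical, and the in-memory
buffer only stands in for the data s3a holds until close(). The check starts when the
stream is constructed; write() reports a failure only if the check has already finished,
while close() waits for the result.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical stream: starts the parent-path check in the background at create() time.
class DeferredCheckStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final CompletableFuture<Void> check;

    // parentCheck is an assumed callback that throws if an ancestor path is a file.
    DeferredCheckStream(Runnable parentCheck) {
        this.check = CompletableFuture.runAsync(parentCheck);
    }

    // Rethrow a failed check as an IOException; optionally block until it has finished.
    private void surfaceFailure(boolean waitForResult) throws IOException {
        if (!waitForResult && !check.isDone()) {
            return; // check still running: do not stall the writer
        }
        try {
            check.join();
        } catch (CompletionException e) {
            throw new IOException("Parent path check failed", e.getCause());
        }
    }

    @Override
    public void write(int b) throws IOException {
        surfaceFailure(false);
        buffer.write(b);
    }

    @Override
    public void close() throws IOException {
        surfaceFailure(true);
        // The upload would happen here. A file created higher up the tree after
        // the check completed is still not detected: the race remains.
    }

    int bufferedBytes() {
        return buffer.size();
    }
}
```

Even with close() waiting on the check, nothing here closes the window between the check
finishing and the object actually being written.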

> s3a create() doesn't check for a parent path being a file
> ---------------------------------------------------------
>                 Key: HADOOP-13221
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13221
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.7.2
>            Reporter: Steve Loughran
>            Assignee: Rajesh Balamohan
> Seen in a code review. Notably, if true, this got past all the FS contract tests, which
> shows we missed a couple.
> {{S3AFileSystem.create()}} does not examine its parent paths to verify that none of them
> is a file. It looks for the destination path if overwrite=false (see HADOOP-13188
> for issues there), but it doesn't check that the parent is not a file, nor the parent of
> that path.
> It must go up the tree, verifying that each path either does not exist or is a directory.
> The scan can stop at the first entry which is a directory, so the operation
> is O(empty-directories) and not O(directories).
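
The upward scan described in the issue could look roughly like the sketch below. It is
an illustration under assumptions, not code from S3AFileSystem: the class
ParentPathCheck, its in-memory map standing in for the object store, and the probe
counter are all hypothetical, and a real implementation would presumably probe the
store and raise an IOException subclass instead of IllegalStateException.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: walk up from the destination, stop at the first directory.
class ParentPathCheck {
    enum Kind { FILE, DIRECTORY }

    // Stand-in for the object store: path -> kind; an absent key means "does not exist".
    private final Map<String, Kind> entries = new HashMap<>();

    // Counts lookups, to show the scan ends at the first existing directory.
    int probes = 0;

    void put(String path, Kind kind) {
        entries.put(path, kind);
    }

    // Parent of "/a/b/c" is "/a/b"; returns null once the root is reached.
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? null : path.substring(0, slash);
    }

    // Throws if any ancestor of dest is a file; returns quietly otherwise.
    void verifyNoFileAncestor(String dest) {
        for (String p = parentOf(dest); p != null; p = parentOf(p)) {
            probes++;
            Kind kind = entries.get(p);
            if (kind == Kind.FILE) {
                throw new IllegalStateException("Parent path is a file: " + p);
            }
            if (kind == Kind.DIRECTORY) {
                return; // an existing directory: no need to look further up
            }
            // Path does not exist: keep climbing.
        }
    }
}
```

With "/a" registered as a directory, verifying "/a/b/c/file" probes "/a/b/c", "/a/b"
and "/a" and then stops, so the cost tracks the number of missing ancestor levels
rather than the full depth of the tree.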

