hadoop-common-issues mailing list archives

From "Sourabh Goyal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11572) s3a delete() operation fails during a concurrent delete of child entries
Date Mon, 17 Jul 2017 12:47:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089770#comment-16089770 ]

Sourabh Goyal commented on HADOOP-11572:
----------------------------------------

[~stevel@apache.org]: We hit this issue recently in production. We applied the patch from this JIRA and got the following stack trace:

{code}
17/07/15 12:40:24 ERROR s3a.S3AFileSystem: Partial failure of delete, 1 errors
com.amazonaws.services.s3.model.MultiObjectDeleteException: One or more objects could not
be deleted (Service: null; Status Code: 200; Error Code: null; Request ID: xxxxxxx), S3 Extended
Request ID: xxxxxxxxxxx
at com.amazonaws.services.s3.AmazonS3Client.deleteObjects(AmazonS3Client.java:1785)
at org.apache.hadoop.fs.s3a.S3AFileSystem.deleteObjects(S3AFileSystem.java:775)
at org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeys(S3AFileSystem.java:750)
at org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:709)
at org.apache.hadoop.fs.shell.MoveCommands$Rename.processPath(MoveCommands.java:110)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:248)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:328)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:300)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:282)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:266)
at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:211)
at org.apache.hadoop.fs.shell.Command.run(Command.java:175)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
17/07/15 12:40:24 ERROR s3a.S3AFileSystem: sourabhg/files5/part-10472-5f8b30f4-dc37-4419-9a7e-c1642ff5f0a1.parquet:
"InternalError" - We encountered an internal error. Please try again.
{code}

This is consistently reproducible when we try to delete ~30K or more files.
As we can see, AWS returns an *InternalError*. To fix this, we retried the failed multi-object delete request and it worked.
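
Roughly, the retry looks like this (a minimal sketch against the AWS SDK v1 API, not the exact code we ran; class and method names are illustrative, and retry limits/backoff are omitted):

{code}
import java.util.ArrayList;
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsRequest.KeyVersion;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;

public class RetryingDelete {
  static void deleteKeys(AmazonS3 s3, String bucket, List<KeyVersion> keys) {
    try {
      s3.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(keys));
    } catch (MultiObjectDeleteException e) {
      // Only some objects failed (e.g. with "InternalError"); collect them ...
      List<KeyVersion> failed = new ArrayList<>();
      for (MultiObjectDeleteException.DeleteError err : e.getErrors()) {
        failed.add(new KeyVersion(err.getKey()));
      }
      // ... and issue one more multi-object delete for just those keys.
      s3.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(failed));
    }
  }
}
{code}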
 

> s3a delete() operation fails during a concurrent delete of child entries
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-11572
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11572
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.6.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>             Fix For: 2.9.0, 3.0.0-alpha4
>
>         Attachments: HADOOP-11572-001.patch, HADOOP-11572-branch-2-002.patch, HADOOP-11572-branch-2-003.patch
>
>
> Reviewing the code, s3a has the problem raised in HADOOP-6688: deletion of a child entry
> during a recursive directory delete is propagated as an exception, rather than ignored as
> a detail which idempotent operations should just ignore.
> The exception should be caught and, if it is a file-not-found problem, logged rather than propagated.
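
A minimal sketch of the behaviour the description asks for (the {{deleteChildEntry}} helper and logger are hypothetical, not the committed fix):

{code}
// Sketch only: treat a concurrent child deletion as success instead of
// failing the whole recursive delete. deleteChildEntry() is hypothetical.
try {
  deleteChildEntry(child);
} catch (FileNotFoundException e) {
  // Another client deleted this entry first; delete() is idempotent,
  // so log at debug and continue rather than propagate the exception.
  LOG.debug("Child entry {} already deleted, ignoring", child, e);
}
{code}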




