hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
Date Thu, 19 Jun 2014 22:03:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037978#comment-14037978 ]

Colin Patrick McCabe commented on HDFS-5546:
--------------------------------------------

Hmm.  I think you're right about this.  {{Command#processPaths}} wraps the call to {{recursePath}}
in a try/catch block for IOException (IOE).  So if we can't recurse into an individual child, we
should still be able to move on to the other ones.  Of course, there is no such protection
for the paths specified on the command line.  I tried looking for all the places an IOE could
be thrown, but didn't manage to spot one that would abort the recursion because of a problem
with a child.  Eddy, can you run your unit test against trunk to verify all this?
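
The per-child protection described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual Hadoop code: {{TolerantLs}}, its in-memory directory map, the {{vanished}} set and the {{demo}} helper are all hypothetical, and only the method names {{processPaths}} and {{recursePath}} are borrowed from the comment. As in the description, the starting path itself gets no such protection.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the shell command; not the real Hadoop Command class.
class TolerantLs {
    private final Map<String, List<String>> dirs; // directory -> child paths (assumed in-memory tree)
    private final Set<String> vanished;           // directories removed after they were stat'ed
    final List<String> printed = new ArrayList<>();
    final List<String> errors = new ArrayList<>();

    TolerantLs(Map<String, List<String>> dirs, Set<String> vanished) {
        this.dirs = dirs;
        this.vanished = vanished;
    }

    // The "stat" still reports a vanished directory as a directory.
    boolean isDirectory(String path) {
        return dirs.containsKey(path) || vanished.contains(path);
    }

    // Listing a vanished or unknown directory fails, like a listing call on a deleted path.
    List<String> listStatus(String dir) throws IOException {
        if (vanished.contains(dir) || !dirs.containsKey(dir)) {
            throw new FileNotFoundException(dir);
        }
        return dirs.get(dir);
    }

    // Each child is handled inside its own try/catch, so one failure
    // is recorded and the loop moves on to the next sibling.
    void processPaths(List<String> paths) {
        for (String p : paths) {
            try {
                printed.add(p);
                if (isDirectory(p)) {
                    recursePath(p);
                }
            } catch (IOException e) {
                errors.add("ls: cannot list " + p);
            }
        }
    }

    // No try/catch here: a failure on the starting path propagates to the caller.
    void recursePath(String dir) throws IOException {
        processPaths(listStatus(dir));
    }

    // Builds a small tree in which /b disappears between stat and list.
    static TolerantLs demo() {
        Map<String, List<String>> dirs = new HashMap<>();
        dirs.put("/", List.of("/a", "/b", "/c"));
        dirs.put("/a", List.of("/a/x"));
        dirs.put("/c", List.of("/c/y"));
        TolerantLs ls = new TolerantLs(dirs, Set.of("/b"));
        try {
            ls.recursePath("/");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ls;
    }

    public static void main(String[] args) {
        TolerantLs ls = demo();
        ls.printed.forEach(System.out::println);
        ls.errors.forEach(System.err::println);
    }
}
```

In this sketch the failure on /b is recorded once and /c is still listed, which is the behaviour the unit test against trunk would need to confirm for the real code.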

> race condition crashes "hadoop ls -R" when directories are moved/removed
> ------------------------------------------------------------------------
>
>                 Key: HDFS-5546
>                 URL: https://issues.apache.org/jira/browse/HDFS-5546
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch, HDFS-5546.2.001.patch, HDFS-5546.2.002.patch
>
>
> This seems to be a rare race condition where we have a sequence of events like this:
> 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D.
> 2. Someone deletes or moves directory D.
> 3. org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which calls DFS#listStatus(D). This throws FileNotFoundException.
> 4. The ls command terminates with FileNotFoundException.
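
The stat-then-list sequence above can be reproduced deterministically on a local filesystem. The following is only an analogue, assuming java.nio.file in place of the DFS calls: {{StatThenListRace}} and {{reproduce}} are hypothetical names, and the local API reports the missing directory as NoSuchFileException rather than FileNotFoundException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical local-filesystem analogue of the race; it does not touch HDFS.
class StatThenListRace {
    static String reproduce() {
        try {
            Path d = Files.createTempDirectory("hdfs5546-demo");
            boolean wasDirectory = Files.isDirectory(d); // step 1: stat succeeds
            Files.delete(d);                             // step 2: directory is removed
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(d)) { // step 3: list
                return "listed";
            } catch (NoSuchFileException e) {
                // step 4: without a handler like this one, the command would terminate here
                return wasDirectory ? "stat ok, list failed" : "stat failed";
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```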



--
This message was sent by Atlassian JIRA
(v6.2#6252)
