hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed
Date Wed, 25 Jun 2014 19:41:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043956#comment-14043956 ]

Colin Patrick McCabe commented on HDFS-5546:
--------------------------------------------

I agree with a lot of the stuff that's been presented, but I also think our behavior should
be consistent between "{{ls /a1/b /a2/b}}" and "{{ls /a\{1,2\}/b}}", and right now I can't
see a good way to achieve that if we catch IOE (since the globber does not catch IOE).  On
the other hand, if we catch FNF and continue if a top-level directory disappears on us, then
we are making things more consistent, since the globber catches and ignores FNFs (when dealing
with globs).

bq. Colin Patrick McCabe shouldn't the globStatus() be out of scope for this JIRA? Maybe we
should open another related JIRA?

I'm not sure how the globber would report IOE other than throwing it.  We'd have to return
a list of {{Option<FileStatus, IOException>}} or something?  It doesn't seem like the
kind of change that could be made compatibly, since we'd need a new interface.
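
To make the compatibility concern concrete, here is a minimal sketch of what a "status or exception" return type could look like. It is not Hadoop code: {{StatusOrError}}, {{GlobResults}}, and {{Lookup}} are hypothetical names invented for illustration, and the lookup is a plain callback rather than a real file system call. Because each element carries either a value or an {{IOException}}, callers would have to unwrap every entry, which is why this shape could not replace the existing {{FileStatus[]}} return type of globStatus() without a new interface.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder: either a value or the IOException that prevented it. */
final class StatusOrError<T> {
    private final T value;
    private final IOException error;

    private StatusOrError(T value, IOException error) {
        this.value = value;
        this.error = error;
    }

    static <T> StatusOrError<T> ok(T value) {
        return new StatusOrError<>(value, null);
    }

    static <T> StatusOrError<T> failed(IOException error) {
        return new StatusOrError<>(null, error);
    }

    boolean isOk() {
        return error == null;
    }

    IOException error() {
        return error;
    }

    /** Returns the value, or rethrows the recorded exception. */
    T get() throws IOException {
        if (error != null) {
            throw error;
        }
        return value;
    }
}

final class GlobResults {
    /** Hypothetical stand-in for a per-path status lookup. */
    interface Lookup<T> {
        T lookup(String path) throws IOException;
    }

    /** Resolves every path, recording failures instead of aborting on the first one. */
    static <T> List<StatusOrError<T>> resolveAll(List<String> paths, Lookup<T> lookup) {
        List<StatusOrError<T>> results = new ArrayList<>();
        for (String path : paths) {
            try {
                results.add(StatusOrError.ok(lookup.lookup(path)));
            } catch (IOException e) {
                results.add(StatusOrError.<T>failed(e));
            }
        }
        return results;
    }
}
```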

So overall I would lean towards just catching FNF at the top-level, like the earlier patch
did.  And maybe revisiting this later if we have better ideas about how to handle this in
the globber as well.  [~daryn], [~eddyxu], does that make sense?  Or am I trying too hard
to be consistent? :)
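
As a rough illustration of "catching FNF at the top level" (a sketch only, not the contents of any attached patch), the loop below skips a top-level argument that has disappeared while letting every other {{IOException}} propagate. {{TopLevelLs}}, {{Lister}}, and {{lsAll}} are hypothetical names, and the listing is a callback rather than a real {{FileSystem}} call.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class TopLevelLs {
    /** Hypothetical stand-in for listing one directory. */
    interface Lister {
        List<String> list(String dir) throws IOException;
    }

    /**
     * Lists each top-level argument in turn. A FileNotFoundException for one
     * argument is recorded as a warning and the loop continues; any other
     * IOException still aborts the whole command.
     */
    static List<String> lsAll(List<String> args, Lister lister, List<String> warnings)
            throws IOException {
        List<String> entries = new ArrayList<>();
        for (String arg : args) {
            try {
                entries.addAll(lister.list(arg));
            } catch (FileNotFoundException e) {
                // Assumed message format, chosen for illustration only.
                warnings.add("ls: " + arg + ": No such file or directory");
            }
        }
        return entries;
    }
}
```

With this shape, a vanished top-level directory is reported and skipped rather than aborting the command, while a non-FNF failure still surfaces to the caller.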

> race condition crashes "hadoop ls -R" when directories are moved/removed
> ------------------------------------------------------------------------
>
>                 Key: HDFS-5546
>                 URL: https://issues.apache.org/jira/browse/HDFS-5546
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch, HDFS-5546.2.001.patch, HDFS-5546.2.002.patch, HDFS-5546.2.003.patch, HDFS-5546.2.004.patch
>
>
> This seems to be a rare race condition where we have a sequence of events like this:
> 1. org.apache.hadoop.fs.shell.Ls calls DFS#getFileStatus on directory D.
> 2. Someone deletes or moves directory D.
> 3. org.apache.hadoop.fs.shell.Ls calls PathData#getDirectoryContents(D), which calls DFS#listStatus(D). This throws FileNotFoundException.
> 4. The ls command terminates with FNF.
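
The same check-then-list sequence can be reproduced deterministically on a local file system. The sketch below uses {{java.nio.file}} as a stand-in for HDFS, so it is an analogy rather than the actual code path: the local API throws {{NoSuchFileException}} where DFS#listStatus throws {{FileNotFoundException}}, and {{LsRaceDemo}} is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.stream.Stream;

final class LsRaceDemo {
    /**
     * Returns true if the directory passed the status check but had
     * vanished by the time it was listed.
     */
    static boolean raceReproduced() throws IOException {
        Path d = Files.createTempDirectory("hdfs5546-demo");

        // Step 1: the status check succeeds while D still exists.
        boolean wasDirectory = Files.isDirectory(d);

        // Step 2: "someone" removes D between the check and the listing.
        Files.delete(d);

        // Step 3: listing D now fails because it is gone.
        try (Stream<Path> entries = Files.list(d)) {
            entries.count();
            return false;
        } catch (NoSuchFileException e) {
            // Step 4: without a handler like this one, the caller would terminate here.
            return wasDirectory;
        }
    }
}
```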



--
This message was sent by Atlassian JIRA
(v6.2#6252)
