hadoop-common-dev mailing list archives

From "Koji Noguchi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5712) Namenode slowed down when many files with same filename were moved to Trash
Date Mon, 11 May 2009 17:51:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708134#action_12708134 ]

Koji Noguchi commented on HADOOP-5712:

bq. At present, it is very easy to find out the last version of the file in Trash.

If all the nodes in the cluster are in the same timezone, the timestamp would (almost) serve this purpose.

bq. Another option would be to make the Trash client retrieve the contents of the Trash directory
and then scan what files pre-exist in the list. 

listStatus is one of the most expensive calls to the Namenode right now.
I really want to avoid adding extra overhead on the Namenode for such a common command.

bq. Also, this code is not fool-proof because there is no atomicity between the exists and
the rename. 

True.  But I haven't seen this become a problem yet.
For me, the contract is that we *try* to move the files to Trash, but if that fails, we simply delete them.
We completely delete the files if the rename fails twice in a row anyway.
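
The best-effort contract described above can be sketched roughly as follows. This is only an illustration on a local filesystem, not the actual Hadoop Trash code: the function name move_to_trash, the attempts parameter, and the timestamp suffix used for name collisions are all assumptions made for the example. It shows the non-atomic exists-then-rename step and the fallback to deletion once every attempt has failed.

```python
import os
import time


def move_to_trash(path, trash_dir, attempts=2):
    """Best-effort move of `path` into `trash_dir` (hypothetical helper).

    Returns "trashed" if a rename succeeded, or "deleted" if every
    attempt failed and the file was removed outright.
    """
    base = os.path.basename(path)
    for _ in range(attempts):
        try:
            os.makedirs(trash_dir, exist_ok=True)
            target = os.path.join(trash_dir, base)
            # exists() followed by rename() is not atomic: another client
            # could create `target` in between. The sketch accepts that.
            if os.path.exists(target):
                # Assumed collision policy: append a timestamp suffix
                # instead of listing or probing the Trash directory.
                target = "%s.%d" % (target, time.time_ns())
            os.rename(path, target)
            return "trashed"
        except OSError:
            continue
    # Every attempt failed: fall back to a plain delete.
    os.remove(path)
    return "deleted"
```

Note that the collision check costs a single exists() call per file and never lists the Trash directory, which is the property argued for above.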

In short, I want the Trash feature to stay as simple as it is now, without involving the Namenode.

> Namenode slowed down when many files with same filename were moved to Trash
> ---------------------------------------------------------------------------
>                 Key: HADOOP-5712
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5712
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.3
>            Reporter: Koji Noguchi
>            Priority: Minor

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
