hadoop-hdfs-issues mailing list archives

From "Scott Chen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1143) Implement Background deletion
Date Sat, 15 May 2010 06:59:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867819#action_12867819 ]

Scott Chen commented on HDFS-1143:
----------------------------------

In this patch, we add an inner class called subtreeCleaner to FSNamesystem to perform the background
deletion.
This class contains a single-thread executor and provides a method called asyncDelete(INode).
This method cleans up the detached subtree and its blocks in the background.

We also modify the delete method so that it detaches the subtree, deletes the file lease, and then
calls asyncDelete().
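
To illustrate the shape of the change, here is a minimal, self-contained sketch of the detach-then-clean-up pattern described above. It is not the code from the attached patch: Node is a hypothetical stand-in for INode, SubtreeCleaner only mirrors the roles of subtreeCleaner and asyncDelete(INode), and "invalidating blocks" is reduced to incrementing a counter. Lease removal and all namesystem locking are omitted, which a real implementation would have to handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class BackgroundDeleteSketch {

    /** Hypothetical stand-in for an HDFS INode; not the real class. */
    static class Node {
        final String name;
        final int blockCount;               // blocks owned by this node (0 for directories)
        final List<Node> children = new ArrayList<>();
        Node parent;

        Node(String name, int blockCount) {
            this.name = name;
            this.blockCount = blockCount;
        }

        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    /** Hypothetical analogue of the subtreeCleaner inner class. */
    static class SubtreeCleaner {
        // A single worker thread, so cleanups run one at a time in submission order.
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        final AtomicInteger blocksInvalidated = new AtomicInteger();

        /** Queues cleanup of an already-detached subtree and returns immediately. */
        Future<?> asyncDelete(Node detachedRoot) {
            return executor.submit(() -> cleanup(detachedRoot));
        }

        // Post-order walk: clean the children first, then this node's own blocks.
        private void cleanup(Node node) {
            for (Node child : node.children) {
                cleanup(child);
            }
            blocksInvalidated.addAndGet(node.blockCount);
            node.children.clear();
        }

        void shutdown() {
            executor.shutdown();
        }
    }

    /** Cheap foreground part: unlink the subtree, then hand it to the cleaner. */
    static Future<?> delete(Node target, SubtreeCleaner cleaner) {
        Node parent = target.parent;
        if (parent != null) {
            parent.children.remove(target);
            target.parent = null;
        }
        // Lease removal would happen here in the real delete path (omitted in this sketch).
        return cleaner.asyncDelete(target);
    }

    public static void main(String[] args) throws Exception {
        Node root = new Node("/", 0);
        Node dir = new Node("dir", 0)
                .add(new Node("a", 3))
                .add(new Node("b", 5));
        root.add(dir);

        SubtreeCleaner cleaner = new SubtreeCleaner();
        Future<?> pending = delete(dir, cleaner);
        // The subtree is already invisible from the root at this point.
        System.out.println("children of root after delete: " + root.children.size());
        pending.get();                      // wait only so the demo can print the result
        System.out.println("blocks invalidated: " + cleaner.blocksInvalidated.get());
        cleaner.shutdown();
    }
}
```

The caller pays only for the unlink; the walk over the subtree and its blocks happens on the executor thread. Waiting on the returned Future is done here purely for demonstration and would defeat the purpose in the actual delete path.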

On my PC, the original unit test takes about 1000ms to perform the large deletion.
The modified one takes about 20ms.

> Implement Background deletion
> -----------------------------
>
>                 Key: HDFS-1143
>                 URL: https://issues.apache.org/jira/browse/HDFS-1143
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Dmytro Molkov
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: HDFS-1143.txt
>
>
> Right now if you try to delete a massive number of files from the namenode it will freeze
> (sometimes for minutes). Most of the time is spent going through the blocks map and invalidating
> all the blocks.
> This can probably be improved by having a background GC process. The deletion will basically
> just remove the inode being deleted and then give the subtree that was just deleted to the
> background thread running cleanup.
> This way the namenode becomes available for the clients soon after deletion, and all
> the heavy operations are done in the background.
> Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

