hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1704) Throttling for HDFS Trash purging
Date Thu, 06 Sep 2007 17:16:32 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525470 ]

Doug Cutting commented on HADOOP-1704:
--------------------------------------

The problem is that applications can already easily delete large chunks of the filesystem,
so I don't see this as a Trash-specific issue.  Perhaps we can open a new, generic issue
about namenode memory use for large deletes?  I suppose the ultimate goal there would be to
be able to delete the entire filesystem without growing the heap at all?  Would anything less
satisfy folks?  Should we also file issues about adding or opening lots of files at once?
Seriously, we could, based on the available heap and our knowledge of the size of various
data structures, try to limit activity so that it always stays within the heap, returning
application exceptions (TooManyFiles) when the heap would otherwise be exceeded.  Short of
that, I don't see how we can really address stuff like this without a major rewrite of the
namenode, so that it does not use single-host, memory-resident data structures for each file
and block.
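
As a rough illustration of the heap-budget idea above, here is a minimal sketch in Java.
It is not namenode code. The class HeapBudget, its methods, and the per-file and per-block
byte costs are all hypothetical; only the exception name TooManyFiles comes from the comment,
and it is made unchecked here purely to keep the sketch short.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical exception, named after the "TooManyFiles" idea in the comment. */
class TooManyFilesException extends RuntimeException {
    TooManyFilesException(String message) {
        super(message);
    }
}

/** Illustrative admission check: reject work whose estimated heap cost exceeds a budget. */
class HeapBudget {
    private final long budgetBytes;    // assumed share of the heap set aside for this activity
    private final long bytesPerFile;   // assumed in-memory cost of one file entry
    private final long bytesPerBlock;  // assumed in-memory cost of one block entry
    private final AtomicLong reservedBytes = new AtomicLong();

    HeapBudget(long budgetBytes, long bytesPerFile, long bytesPerBlock) {
        this.budgetBytes = budgetBytes;
        this.bytesPerFile = bytesPerFile;
        this.bytesPerBlock = bytesPerBlock;
    }

    /** Reserves heap for an operation, or throws if the budget would be exceeded. */
    long reserve(long files, long blocks) {
        if (files < 0 || blocks < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        long cost = Math.addExact(Math.multiplyExact(files, bytesPerFile),
                                  Math.multiplyExact(blocks, bytesPerBlock));
        while (true) {
            long current = reservedBytes.get();
            long next = Math.addExact(current, cost);
            if (next > budgetBytes) {
                throw new TooManyFilesException(
                    "operation needs " + cost + " bytes, only "
                        + (budgetBytes - current) + " left in budget");
            }
            // Retry if another thread changed the reservation in the meantime.
            if (reservedBytes.compareAndSet(current, next)) {
                return cost;
            }
        }
    }

    /** Returns a previously reserved amount to the budget once the work is done. */
    void release(long cost) {
        reservedBytes.addAndGet(-cost);
    }

    long reserved() {
        return reservedBytes.get();
    }
}
```

A caller would reserve before starting a large delete or create and release when the
related structures are freed. The budget could be derived from Runtime.getRuntime().maxMemory(),
but the real per-file and per-block sizes would have to be measured rather than assumed.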

+1 for closing it.

> Throttling for HDFS Trash purging
> ---------------------------------
>
>                 Key: HADOOP-1704
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1704
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>
> When HDFS Trash is enabled, deletion of a file/directory results in it being moved to
> the "Trash" directory. The "Trash" directory is periodically purged by the Namenode. This
> means that all files/directories that users deleted in the last Trash period get "really"
> deleted when the Trash purging occurs. This might cause a burst of file/directory deletions.
> The Namenode tracks blocks that belonged to deleted files in a data structure named "RecentInvalidateSets".
> There is a possibility that Trash purging may cause this data structure to bloat, causing
> undesirable behaviour of the Namenode.
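
The throttling that the issue title asks for could look roughly like the sketch below, which
purges expired entries in bounded batches with a pause between them instead of in one burst.
This is an assumption about one possible approach, not how the Namenode purges Trash. The
class ThrottledPurger, its parameters, and the deleter callback are hypothetical names.

```java
import java.util.List;
import java.util.function.Consumer;

/** Illustrative batch throttle for a purge; not actual Namenode logic. */
class ThrottledPurger {
    /**
     * Deletes the given paths in batches of at most batchSize, sleeping
     * pauseMillis between batches. Returns the number of paths deleted.
     */
    static int purge(List<String> expired, int batchSize, long pauseMillis,
                     Consumer<String> deleter) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int deleted = 0;
        for (int start = 0; start < expired.size(); start += batchSize) {
            int end = Math.min(start + batchSize, expired.size());
            for (String path : expired.subList(start, end)) {
                deleter.accept(path);
                deleted++;
            }
            // Pause only when more batches remain, so the last batch does not wait.
            if (end < expired.size() && pauseMillis > 0) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop early.
                    Thread.currentThread().interrupt();
                    return deleted;
                }
            }
        }
        return deleted;
    }
}
```

Spreading the deletes out bounds how many block invalidations are queued at once, at the cost
of a longer purge. It does nothing about large deletes issued directly by applications.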

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

