hadoop-hdfs-issues mailing list archives

From "Weiwei Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11922) Ozone: KSM: Garbage collect deleted blocks
Date Mon, 05 Jun 2017 07:30:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036631#comment-16036631 ]

Weiwei Yang commented on HDFS-11922:

Hi [~anu]

Thanks for filing this, it's pretty important. I noticed you added the KSM tag in the title;
do you think this belongs in the KSM layer? I thought this belonged in SCM, because it is SCM
that communicates with the datanodes. It seems more straightforward to let SCM scan for orphan
blocks in a background thread and send the results back in the container report response; the
datanodes can then clean up the corresponding containers/chunks/files. Please let me know if
I missed anything. Thanks.
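The flow proposed above (SCM background scan, cleanup work piggybacked on the container report response) could be sketched roughly as follows. This is only an illustration of the idea under discussion; all class and method names here are invented and do not come from the actual Ozone code base.

```java
import java.util.*;

// Hypothetical sketch: SCM knows which blocks were deleted (or orphaned by an
// overwrite) and where they live; a background scan groups them per datanode
// so each datanode gets its cleanup list in the container report response.
class ScmOrphanScanSketch {
    // blockId -> datanode that holds it (simplified to a single replica)
    static final Map<String, String> blockLocations = new HashMap<>();
    // block IDs whose keys were deleted or overwritten in KSM metadata
    static final Set<String> deletedBlocks = new HashSet<>();

    // Body of the hypothetical background thread: build a per-datanode
    // deletion plan from the set of orphaned blocks.
    static Map<String, List<String>> scanOrphans() {
        Map<String, List<String>> toDelete = new HashMap<>();
        for (String blockId : deletedBlocks) {
            String dn = blockLocations.get(blockId);
            if (dn != null) {
                toDelete.computeIfAbsent(dn, k -> new ArrayList<>()).add(blockId);
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        blockLocations.put("blk-1", "dn-a");
        blockLocations.put("blk-2", "dn-b");
        blockLocations.put("blk-3", "dn-a");
        // a key deletion orphans blk-1; an overwrite of another key orphans blk-3
        deletedBlocks.add("blk-1");
        deletedBlocks.add("blk-3");

        List<String> dnA = scanOrphans().get("dn-a");
        Collections.sort(dnA);
        System.out.println("dn-a deletes: " + dnA);
    }
}
```

In this sketch the datanode side is elided: on receiving its list it would delete the corresponding container/chunk files and acknowledge in the next report.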

> Ozone: KSM: Garbage collect deleted blocks
> ------------------------------------------
>                 Key: HDFS-11922
>                 URL: https://issues.apache.org/jira/browse/HDFS-11922
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Anu Engineer
>            Priority: Critical
> We need to garbage collect deleted blocks from the datanodes. There are two cases where
we will end up with orphaned blocks. One is like classical HDFS, where someone deletes a key
and we need to delete the corresponding blocks.
> The other case is when someone overwrites a key -- an overwrite can be treated as a delete
followed by a new put -- which means the older blocks need to be GC-ed at some point.
> A couple of JIRAs have discussed this in one form or another, so this JIRA consolidates
those discussions.
> HDFS-11796 -- needs this issue fixed for some tests to pass.
> HDFS-11780 -- changed the old overwrite behavior to not support this feature for the
time being.
> HDFS-11920 -- once again runs into this issue when a user tries to put an existing key.
> HDFS-11781 -- the delete key API in KSM only deletes the metadata -- and relies on GC for
cleaning up the actual blocks.
> When we solve this issue, we should also consider two more aspects.
> One, since we support versioning in buckets, tracking which blocks are really orphaned
is something that KSM will do. So delete and overwrite will at some point need to decide how
to handle versioning of buckets.
> Two, if a key exists in a closed container, then it is immutable; hence the strategy
for removing the key might be more complex than just talking to an open container.
> cc : [~xyao], [~cheersyang], [~vagarychen], [~msingh], [~yuanbo], [~szetszwo], [~nandakumar131]
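The overwrite-as-delete-plus-put semantics described in the issue can be sketched as below: the old key's blocks are queued for later GC rather than removed synchronously, which also covers the HDFS-11781 case where delete only touches metadata. Again, all names here are hypothetical, not the real KSM API.

```java
import java.util.*;

// Hypothetical sketch of "overwrite = delete + new put": KSM metadata is
// updated immediately, while the displaced blocks go on a GC queue for the
// background cleanup to process later. Names are invented for illustration.
class KeyOverwriteSketch {
    static final Map<String, List<String>> keyToBlocks = new HashMap<>();
    static final Deque<String> gcQueue = new ArrayDeque<>();

    static void putKey(String key, List<String> newBlocks) {
        List<String> old = keyToBlocks.put(key, newBlocks);
        if (old != null) {
            gcQueue.addAll(old); // old blocks become orphans, GC-ed later
        }
    }

    static void deleteKey(String key) {
        List<String> old = keyToBlocks.remove(key);
        if (old != null) {
            gcQueue.addAll(old); // metadata removed now, data reclaimed later
        }
    }

    public static void main(String[] args) {
        putKey("volume/bucket/key1", Arrays.asList("blk-1", "blk-2"));
        putKey("volume/bucket/key1", Arrays.asList("blk-3")); // overwrite
        deleteKey("volume/bucket/key1");
        System.out.println("gc queue: " + gcQueue);
    }
}
```

Versioned buckets would complicate this: an overwrite creates a new version instead of orphaning the old blocks, so only version deletion (or pruning) would feed the queue.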

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
