hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11922) Ozone: KSM: Garbage collect deleted blocks
Date Fri, 02 Jun 2017 19:16:04 GMT
Anu Engineer created HDFS-11922:

             Summary: Ozone: KSM: Garbage collect deleted blocks
                 Key: HDFS-11922
                 URL: https://issues.apache.org/jira/browse/HDFS-11922
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: ozone
            Reporter: Anu Engineer
            Priority: Critical

We need to garbage collect deleted blocks from the Datanodes. There are two cases in which we
will have orphaned blocks. The first is the same as in classical HDFS: someone deletes a key and
we need to delete the corresponding blocks.

The other case is when someone overwrites a key. An overwrite can be treated as a delete
followed by a new put, which means that the older blocks need to be GC-ed at some point.
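
To make the two cases concrete, here is a minimal in-memory sketch. It is not KSM code: the class `KeyBlockTracker`, its methods, and the string block IDs are all hypothetical, and a real implementation would persist this state and coordinate with the Datanodes. It only shows that a delete and an overwrite both leave behind blocks that no key references any more.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch, not KSM code. All names are illustrative assumptions.
final class KeyBlockTracker {
    // Key name -> IDs of the blocks currently backing that key.
    private final Map<String, List<String>> keyToBlocks = new HashMap<>();
    // Blocks that no key references any more, waiting to be deleted.
    private final Queue<String> pendingGc = new ArrayDeque<>();

    // Put a key. Overwriting an existing key is treated as delete + put,
    // so the blocks of the previous value become orphaned.
    void putKey(String key, List<String> blocks) {
        List<String> old = keyToBlocks.put(key, new ArrayList<>(blocks));
        if (old != null) {
            pendingGc.addAll(old);
        }
    }

    // Delete a key: drop the metadata and queue its blocks for GC.
    // Returns false if the key did not exist.
    boolean deleteKey(String key) {
        List<String> old = keyToBlocks.remove(key);
        if (old == null) {
            return false;
        }
        pendingGc.addAll(old);
        return true;
    }

    // Hand out up to max orphaned blocks for one GC pass.
    List<String> drainOrphans(int max) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < max && !pendingGc.isEmpty()) {
            batch.add(pendingGc.poll());
        }
        return batch;
    }
}
```

A GC pass would call `drainOrphans` periodically and ask the Datanodes to remove the returned blocks; how that request is delivered is outside this sketch.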

A couple of JIRAs have discussed this in one form or another, so I am consolidating all those discussions
in this JIRA.

HDFS-11796 -- needs this issue fixed for some tests to pass
HDFS-11780 -- changed the old overwrite behavior so that overwriting is not supported for the time being
HDFS-11920 -- runs into this issue again when a user tries to put an existing key
HDFS-11781 -- the delete key API in KSM only deletes the metadata and relies on GC for the Datanodes

When we solve this issue, we should also consider two more aspects.

One, we support versioning in the buckets, and tracking which blocks are really orphaned is something
that KSM will do. So delete and overwrite need to decide at some point how to handle versioning
of buckets.
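
One possible way to express the versioning decision, purely as an illustration: the sketch below assumes that when versioning is enabled on a bucket, the previous version of an overwritten key stays readable, so its blocks are not orphaned. That assumption, the class `OverwritePolicy`, and the method name are hypothetical and not taken from KSM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not KSM code. The policy shown is an assumption.
final class OverwritePolicy {
    private OverwritePolicy() { }

    // Returns the blocks that become orphaned when a key is overwritten.
    static List<String> orphanedOnOverwrite(boolean bucketVersioningEnabled,
                                            List<String> oldBlocks) {
        if (bucketVersioningEnabled) {
            // Assumed: the old version is retained, so its blocks are still referenced.
            return new ArrayList<>();
        }
        // Without versioning, the overwrite acts as delete + put.
        return new ArrayList<>(oldBlocks);
    }
}
```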

Two, if a key exists in a closed container, then it is immutable; hence the strategy for removing
the key might be more complex than just talking to an open container.
cc : [~xyao], [~cheersyang], [~vagarychen], [~msingh], [~yuanbo], [~szetszwo], [~nandakumar131]


This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
