Date: Tue, 25 Jul 2017 10:01:00 +0000 (UTC)
From: "Weiwei Yang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-11922) Ozone: KSM: Garbage collect deleted blocks

     [ https://issues.apache.org/jira/browse/HDFS-11922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated HDFS-11922:
-------------------------------
    Attachment: Async delete keys.pdf

> Ozone: KSM: Garbage collect deleted blocks
> ------------------------------------------
>
>                 Key: HDFS-11922
>                 URL: https://issues.apache.org/jira/browse/HDFS-11922
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Anu Engineer
>            Assignee: Weiwei Yang
>            Priority: Critical
>         Attachments: Async delete keys.pdf
>
>
> We need to garbage collect deleted blocks from the Datanodes. There are two cases in which we will have orphaned blocks. One is like classical HDFS, where someone deletes a key and we need to delete the corresponding blocks.
> Another case is when someone overwrites a key. An overwrite can be treated as a delete followed by a new put, which means that the older blocks need to be GC-ed at some point.
> A couple of JIRAs have discussed this in one form or another, so this JIRA consolidates those discussions.
> HDFS-11796 -- needs this issue fixed for some tests to pass.
> HDFS-11780 -- changed the old overwrite behavior so that overwriting is not supported for the time being.
> HDFS-11920 -- once again runs into this issue when a user tries to put an existing key.
> HDFS-11781 -- the delete key API in KSM deletes only the metadata and relies on GC for the Datanodes.
> When we solve this issue, we should also consider two more aspects.
> One, we support versioning in the buckets, and tracking which blocks are really orphaned is something that KSM will do. So delete and overwrite will at some point need to decide how to handle versioning of buckets.
> Two, if a key exists in a closed container, then it is immutable, so the strategy for removing the key might be more complex than just talking to an open container.
> cc : [~xyao], [~cheersyang], [~vagarychen], [~msingh], [~yuanbo], [~szetszwo], [~nandakumar131]
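
The two orphaning cases in the description (an explicit delete, and an overwrite treated as a delete plus a new put) can be illustrated with a minimal sketch. This is not the Ozone or KSM implementation: the class `KeySpaceSketch`, its methods, and the use of a plain set to stand in for a Datanode's block store are all hypothetical names chosen for illustration. Bucket versioning and closed (immutable) containers, which the issue flags as open questions, are deliberately not modeled.

```python
class KeySpaceSketch:
    """Toy model (hypothetical, not KSM code) of deferred block deletion."""

    def __init__(self):
        self.key_blocks = {}       # key name -> list of block IDs
        self.pending_deletes = []  # block IDs orphaned and awaiting GC

    def put_key(self, key, blocks):
        # An overwrite is treated as a delete followed by a new put,
        # so the previous version's blocks become orphaned.
        if key in self.key_blocks:
            self.delete_key(key)
        self.key_blocks[key] = list(blocks)

    def delete_key(self, key):
        # Only the metadata is removed here; the blocks are queued for GC.
        self.pending_deletes.extend(self.key_blocks.pop(key))

    def collect_garbage(self, datanode_blocks):
        # datanode_blocks is a set standing in for one Datanode's blocks.
        # Blocks not present on this Datanode stay in the pending queue.
        removed = [b for b in self.pending_deletes if b in datanode_blocks]
        datanode_blocks.difference_update(removed)
        gone = set(removed)
        self.pending_deletes = [b for b in self.pending_deletes if b not in gone]
        return removed
```

In this sketch, putting a key twice and then deleting it queues every block of both versions, and a later `collect_garbage` call removes whichever of those blocks the given Datanode holds. Separating the metadata delete from the block removal mirrors the behavior the issue attributes to HDFS-11781, where the delete key API removes only metadata and relies on GC.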