Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id C5C4B173E0
    for ; Tue, 4 Nov 2014 03:23:35 +0000 (UTC)
Received: (qmail 2455 invoked by uid 500); 4 Nov 2014 03:23:35 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 2399 invoked by uid 500); 4 Nov 2014 03:23:35 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 2388 invoked by uid 99); 4 Nov 2014 03:23:35 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 04 Nov 2014 03:23:35 +0000
Date: Tue, 4 Nov 2014 03:23:35 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (HBASE-12363) KEEP_DELETED_CELLS considered harmful?
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195643#comment-14195643 ]

stack commented on HBASE-12363:
-------------------------------

Skimmed patch. LGTM. Needs release note and a note in refguide (write the release note in a way in which it can get shoved in refguide)? Nice the way it keeps old behavior.

Did you do this:

bq. I also need to fix the long lines and put an interface annotation/comment/license into the KeepDeletedCells enum.

Can do on commit.

> KEEP_DELETED_CELLS considered harmful?
> --------------------------------------
>
>                 Key: HBASE-12363
>                 URL: https://issues.apache.org/jira/browse/HBASE-12363
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>              Labels: Phoenix
>             Fix For: 2.0.0, 0.98.8, 0.99.2
>
>         Attachments: 12363-master.txt, 12363-test.txt, 12363-v2.txt, 12363-v3.txt
>
>
> Brainstorming...
> This morning on the train (of all places) I realized a fundamental issue in how KEEP_DELETED_CELLS is implemented.
> The problem is around knowing when it is safe to remove a delete marker (we cannot remove it unless all cells affected by it are removed otherwise).
> This was particularly hard for family markers, since they sort before all cells of a row, and hence scanning forward through an HFile you cannot know whether the family markers are still needed until at least the entire row is scanned.
> My solution was to keep the TS of the oldest put in any given HFile, and only remove delete markers older than that TS.
> That sounds good on the face of it... But now imagine you wrote a version of ROW 1 and then never update it again. Then later you write a billion other rows and delete them all. Since the TS of the cells in ROW 1 is older than all the delete markers for the other billion rows, these will never be collected... At least for the region that hosts ROW 1 after a major compaction.
> Note, in a sense that is what HBase is supposed to do when keeping deleted cells: Keep them until they would be removed by some other means (for example TTL, or MAX_VERSION when new versions are inserted).
> The specific problem here is that even as all KVs affected by a delete marker are expired this way, the marker would not be removed if there is just one older KV in the HStore.
> I don't see a good way out of this. In parent I outlined these four solutions:
> So there are three options I think:
> # Only allow the new flag set on CFs with TTL set. MIN_VERSIONS would not apply to deleted rows or delete marker rows (wouldn't know how long to keep family deletes in that case). (MAX)VERSIONS would still be enforced on all row types except for family delete markers.
> # Translate family delete markers to column delete markers at (major) compaction time.
> # Change HFileWriterV* to keep track of the earliest put TS in a store and write it to the file metadata. Use that to expire delete markers that are older and hence can't affect any puts in the file.
> # Have Store.java keep track of the earliest put in internalFlushCache and compactStore and then append it to the file metadata. That way HFileWriterV* would not need to know about KVs.
> And I implemented #4.
> I'd love to get input on ideas.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
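
The "earliest put TS" rule and the ROW 1 scenario from the quoted description can be sketched in a few lines of Java. This is only an illustration of the reasoning, not HBase code: the class `EarliestPutSketch`, its methods, and the use of bare timestamps in place of real cells are all hypothetical, and treating a store with no puts as "every marker may go" is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: cells and delete markers are reduced to their timestamps.
class EarliestPutSketch {
    // Smallest put timestamp in the store. Returns Long.MAX_VALUE when there
    // are no puts (assumption: with no puts, no marker can mask anything).
    static long earliestPutTs(List<Long> putTimestamps) {
        long earliest = Long.MAX_VALUE;
        for (long ts : putTimestamps) {
            earliest = Math.min(earliest, ts);
        }
        return earliest;
    }

    // Delete-marker timestamps that survive a compaction: a marker is dropped
    // only when it is strictly older than the earliest put in the store.
    static List<Long> retainedMarkers(List<Long> markerTimestamps, long earliestPutTs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : markerTimestamps) {
            if (ts >= earliestPutTs) {
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Delete markers written for the "other" rows.
        List<Long> markers = List.of(100L, 101L, 102L);

        // ROW 1 was written once at ts=1 and never updated again.
        long withOldRow = earliestPutTs(List.of(1L));
        System.out.println(retainedMarkers(markers, withOldRow).size()); // 3

        // Without that old cell, the remaining puts are all newer than the markers.
        long withoutOldRow = earliestPutTs(List.of(200L));
        System.out.println(retainedMarkers(markers, withoutOldRow).size()); // 0
    }
}
```

A single old put at ts=1 pins every later marker in place, which is the "never collected" case described above; once the oldest put is newer than the markers, they all become removable.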