From: lars hofhansl
To: "user@hbase.apache.org"
Date: Mon, 22 Sep 2014 22:04:42 -0700
Subject: Re: Configuring tombstone purge independent of deleted cell purge

You can use the hbase.hstore.time.to.purge.deletes config option.
You can set it globally or per Column Family.

This is the description in hbase-default.xml:
  <property>
    <name>hbase.hstore.time.to.purge.deletes</name>
    <value>0</value>
    <description>The amount of time to delay purging of delete markers
    with future timestamps. If unset, or set to 0, all delete markers,
    including those with future timestamps, are purged during the next
    major compaction. Otherwise, a delete marker is kept until the major
    compaction which occurs after the marker's timestamp plus the value
    of this setting, in milliseconds.
    </description>
  </property>

That seems to be exactly what you want.

-- Lars

----- Original Message -----
From: James Estes
To: user@hbase.apache.org
Cc:
Sent: Monday, September 22, 2014 10:39 AM
Subject: Configuring tombstone purge independent of deleted cell purge

Could tombstone purges be independent of purging deleted cells and KEEP_DELETED_CELLS setting?

In my use case, I do not want to keep deleted cells, but I do need to keep the tombstones around. Without the tombstones, I'm not able to do incremental backups (custom, we do timerange raw scans ourselves for this).

As a rough example, if I have the following timeline for the same row key (where t# is time):

t0 - full backup (using a time range up to t0)
t1 - PUT v1
t2 - incremental backup #1 (time range t0 up to t2)
t3 - DELETE
t4 - flush and major compaction happens
t5 - incremental backup #2 (time range t2 up to t5)
t6 - full system crash
t7 - data restored from full backup + incrementals #1 and #2

When the restore completes, the row will have been un-deleted. This is because the incremental backup in #2 will not have the tombstone, since it gets compacted out.

So in our case, I do NOT want to keep deleted cells (because I do not want the cells to show up in time range scans users may do), but I DO want to keep the tombstones for a configurable amount of time (much larger than our planned incremental backup schedule) so they are captured during backup. This would allow for the custom incremental backups to be independent of major compactions. Without it, the backup schedule would have to manually handle compactions and would always have to do a FULL Backup after a major compaction (otherwise there can be loss because when any major compaction happens, any tombstone that came in after the last incremental will be lost).

It seems like there could be another setting for when to purge tombstones. Currently, there is hbase.hstore.time.to.purge.deletes for when to purge deleted cells, but ONLY if KEEP_DELETED_CELLS is configured (which makes sense).

I'd like to propose a hbase.hstore.time.to.purge.tombstones that could default to the same value as hbase.hstore.time.to.purge.deletes, but would take effect regardless of the KEEP_DELETED_CELLS setting. It should have a constraint so that hbase.hstore.time.to.purge.deletes < hbase.hstore.time.to.purge.tombstones (b/c we don't want tombstones disappearing before the deleted cells).

Does this seem reasonable? Is there another approach I might take?

Thanks,
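
The t0..t7 timeline in this thread can be replayed with a small toy model. The sketch below is plain Python, not HBase code: Cell, major_compact, raw_scan, visible_value and run_timeline are all hypothetical names, it models a single row and column with KEEP_DELETED_CELLS off, and its purge rule is only a paraphrase of the hbase-default.xml description quoted above (a marker survives a major compaction while the compaction time is not yet past the marker's timestamp plus the delay). The times 100..107 stand in for t0..t7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    ts: int          # cell timestamp
    kind: str        # "put" or "delete" (a delete marker / tombstone)
    value: str = ""


def major_compact(cells, now, time_to_purge):
    """Toy major compaction with KEEP_DELETED_CELLS off (assumed model)."""
    markers = [c for c in cells if c.kind == "delete"]
    kept = []
    for c in cells:
        if c.kind == "put":
            # A put covered by any marker at or after its timestamp is dropped.
            if not any(m.ts >= c.ts for m in markers):
                kept.append(c)
        elif time_to_purge > 0 and now <= c.ts + time_to_purge:
            # The marker is kept only while a positive purge delay has not elapsed.
            kept.append(c)
    return kept


def raw_scan(cells, start, end):
    """Raw time-range scan over [start, end), delete markers included."""
    return [c for c in cells if start <= c.ts < end]


def visible_value(cells):
    """Value a normal read would return after restore, or None if deleted."""
    markers = [c for c in cells if c.kind == "delete"]
    live = [c for c in cells
            if c.kind == "put" and not any(m.ts >= c.ts for m in markers)]
    return max(live, key=lambda c: c.ts).value if live else None


def run_timeline(time_to_purge):
    store = []
    full = raw_scan(store, 0, 100)                   # t0: full backup
    store.append(Cell(101, "put", "v1"))             # t1: PUT v1
    inc1 = raw_scan(store, 100, 102)                 # t2: incremental #1
    store.append(Cell(103, "delete"))                # t3: DELETE
    store = major_compact(store, 104, time_to_purge) # t4: major compaction
    inc2 = raw_scan(store, 102, 105)                 # t5: incremental #2
    return visible_value(full + inc1 + inc2)         # t7: restore


print(run_timeline(0))       # marker purged at t4, so the restore un-deletes the row
print(run_timeline(1000))    # marker survives t4 and is captured by incremental #2
```

In this model a delay of 0 restores "v1", which is the un-delete described in the original message, while any delay longer than the gap between the DELETE and the next incremental backup leaves the row deleted after restore.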