From: "Jimmy Hu (JIRA)"
To: issues@hbase.apache.org
Date: Thu, 16 Sep 2010 13:18:33 -0400 (EDT)
Subject: [jira] Commented: (HBASE-2999) hbase TTL can be suboptimal and leave small regions after compaction
Message-ID: <22902925.233861284657513569.JavaMail.jira@thor>
In-Reply-To: <13550843.208291284575012819.JavaMail.jira@thor>

    [ https://issues.apache.org/jira/browse/HBASE-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910230#action_12910230 ]

Jimmy Hu commented on HBASE-2999:
---------------------------------

It turns out that HBase runs a daily
major_compact on the tables. For data older than the TTL, the major compaction actually removes all records if the region's last timestamp is older than the TTL. This resolved the old-data issue.

However, major compaction should be optimized: when it notices that a region's last timestamp is older than the TTL, it should simply remove the file instead of reading it and discarding it record by record. That would improve speed and reduce CPU and memory usage.

After the major compaction test, though, I found that I ended up with several regions that have no data in them, and these regions are not merged even though they are empty and consecutive. That means that if we run this on a production system with keys in chronological order, we will end up with thousands of regions as time goes on, and the number of regions will never decrease, even though old data is compacted away. We don't really mind having several empty regions, but the fact that the region count keeps growing without bound is really troublesome. It wastes Hadoop NameNode resources and wastes memory on the regionservers, as each region takes some memory to store region info.

J.D. mentions that currently merging regions can only be done while HBase is offline. A long time ago this was opened: https://issues.apache.org/jira/browse/HBASE-420. And some work was done to at least be able to merge regions in disabled tables: https://issues.apache.org/jira/browse/HBASE-1621, but it requires a lot more engineering.

Stack mentions that it would be easy enough to write a script to do this, run out of cron, but yes, we should have a facility to sweep HBase and, in particular, if regions are empty of store files, merge them into a neighbour.
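To make the two ideas above concrete, here is a minimal sketch in Python (not HBase code). It assumes each store file exposes the timestamp of its newest cell, drops a file outright when that timestamp is older than the TTL, and then looks for runs of consecutive empty regions that a sweep could merge. `StoreFile`, `Region`, `drop_expired_files` and `empty_runs` are hypothetical names used only for illustration; none of them is an HBase API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoreFile:
    # Hypothetical stand-in for a store file; only the newest cell timestamp matters here.
    name: str
    max_timestamp_ms: int


@dataclass
class Region:
    # Hypothetical stand-in for a region; a list of these is assumed to be in key order.
    name: str
    store_files: List[StoreFile]


def is_file_expired(f: StoreFile, ttl_ms: int, now_ms: int) -> bool:
    """True when even the newest cell in the file is older than the TTL,
    so the whole file could be removed without reading its records."""
    return now_ms - f.max_timestamp_ms > ttl_ms


def drop_expired_files(region: Region, ttl_ms: int, now_ms: int) -> List[str]:
    """Remove wholly expired files from the region and return their names."""
    dropped = [f.name for f in region.store_files if is_file_expired(f, ttl_ms, now_ms)]
    region.store_files = [
        f for f in region.store_files if not is_file_expired(f, ttl_ms, now_ms)
    ]
    return dropped


def empty_runs(regions: List[Region]) -> List[List[str]]:
    """Group consecutive regions that have no store files.
    Only runs of two or more are reported (an assumption of this sketch)."""
    runs: List[List[str]] = []
    current: List[str] = []
    for r in regions:
        if not r.store_files:
            current.append(r.name)
        else:
            if len(current) >= 2:
                runs.append(current)
            current = []
    if len(current) >= 2:
        runs.append(current)
    return runs
```

A cron-driven sweep along the lines Stack describes could call `drop_expired_files` on every region and then hand each run from `empty_runs` to whatever merge facility is available. The merge step itself is left out because, as noted above, online merging did not exist at the time.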
> hbase TTL can be suboptimal and leave small regions after compaction
> --------------------------------------------------------------------
>
>                 Key: HBASE-2999
>                 URL: https://issues.apache.org/jira/browse/HBASE-2999
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>   Affects Versions: 0.89.20100621
>         Environment: All
>            Reporter: Jimmy Hu
>
> Yes, the current compaction-based TTL works as advertised if the key randomly distributes the incoming data among all regions. However, if the key is designed in chronological order, the TTL doesn't really work, as no compaction will happen for data already written. So we can't say that the current TTL really works as advertised, as it depends on the key structure.
>
> This is a pity, because a major use case for HBase is storing history or log data. Normally people only want to retain the data for a fixed period; for example, the US government default data retention policy is 7 years. That data is saved in chronological order. The current TTL implementation doesn't work at all for that kind of use case.
>
> In order for that use case to really work, HBase needs an active thread that periodically runs and checks whether there is data older than the TTL, deletes the data older than the TTL if necessary, and compacts small regions older than a certain time period into larger ones to save system resources. It can optimize the deletion by deleting the whole region if it detects that the last timestamp for the region is older than the TTL. The following parameters should be configurable for HBase:
> 1. Whether to disable/enable the TTL thread.
> 2. The interval at which the TTL thread will run. Maybe we can use a special value like 0 to indicate that we don't run the TTL thread, thus saving one configuration parameter. For the default TTL, it should probably be set to 1 day.
> 3. How small a region must be to be merged. It should be a percentage of the store size.
>    For example, if 2 consecutive regions are only 10% of the store size (default is 256M), we can initiate a region merge. We probably need a parameter to reduce the merging too; for example, we only merge regions whose largest timestamp is older than half of the TTL.
>
> We are tracking min/max timestamps in storefiles currently, so it's possible that we could expire some files of a region as well, even if the region was not completely expired. So at minimum, we should be able to implement dropping the stores that are older than the TTL. If all stores for a region are dropped, we should drop the whole region and update the key range of the adjacent region, so that no key hole is left.
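The merge criteria proposed in item 3 can be sketched as a single predicate. This is a rough Python illustration, not HBase code. It assumes the 10% threshold applies to the combined size of the two neighbouring regions (the issue text does not say whether it is combined or per region), and that both regions must have a largest timestamp older than half the TTL. `RegionStats`, `should_merge` and the parameter names are hypothetical.

```python
from dataclasses import dataclass

# The 256M default store size mentioned in the issue.
DEFAULT_STORE_SIZE_BYTES = 256 * 1024 * 1024


@dataclass
class RegionStats:
    # Hypothetical per-region summary; not an HBase type.
    size_bytes: int
    max_timestamp_ms: int


def should_merge(
    a: RegionStats,
    b: RegionStats,
    ttl_ms: int,
    now_ms: int,
    merge_fraction: float = 0.10,
    store_size_bytes: int = DEFAULT_STORE_SIZE_BYTES,
) -> bool:
    """Decide whether two consecutive regions are candidates for a merge."""
    # Assumption: the size threshold is applied to the two regions combined.
    small_enough = a.size_bytes + b.size_bytes <= merge_fraction * store_size_bytes
    # Only merge regions whose newest data is older than half of the TTL.
    old_enough = all(now_ms - r.max_timestamp_ms > ttl_ms / 2 for r in (a, b))
    return small_enough and old_enough
```

With a 7-day TTL, two regions of 10 MB and 5 MB whose newest cells are 4 and 5 days old would qualify, while a pair that includes a region written to yesterday would not. The age guard keeps the sweep away from regions that are still receiving chronological writes.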