Date: Mon, 17 Aug 2009 13:24:14 -0700 (PDT)
From: "stack (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-792) Rewrite getClosestAtOrJustBefore; doesn't scale as currently written

    [ 
https://issues.apache.org/jira/browse/HBASE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744183#action_12744183 ]

stack commented on HBASE-792:
-----------------------------

HBASE-1761 rewrote getClosestAtOrJustBefore. The code is much cleaner and more focused on the target key. We can do this now because of axioms such as: deletes only apply to the file that follows. It no longer carries around bulky Maps of candidates or of deletes (now that we have new-style deletes), so it should be more performant.

The one thing left to do is an early-out if we get an answer early in the processing -- in memstore, say. I tried to do this as part of HBASE-1761, but it only worked if the client asked for the first row in a region. Need to make it so that getclosest, when run against a meta table, leverages HRegionInfo: if the target row key falls between the start and end key of the region, the answer is the one we want, so exit.

> Rewrite getClosestAtOrJustBefore; doesn't scale as currently written
> --------------------------------------------------------------------
>
>                 Key: HBASE-792
>                 URL: https://issues.apache.org/jira/browse/HBASE-792
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>         Attachments: 792.patch
>
>
> As currently written, as a table gets bigger, the number of rows .META. needs to keep count of grows.
> As written, our getClosestAtOrJustBefore goes through every storefile and in each picks up any row that could be a possible candidate for closest before. It doesn't just get the closest from the storefile, but all keys that are closest before. It's not selective because it cannot tell, at the store file level, which of the candidates will survive deletes that are sitting in later store files or up in memcache.
> So, if a store file has keys 0-10 and we ask to get the row that is closest or just before 7, it returns rows 0-7, and so on per store file.
> Can get big and slow weeding out the key wanted.
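
To make the contrast above concrete, here is a rough Java sketch, not the HBase code. It models one store file as a sorted set of integer keys and compares the "collect every candidate" behaviour from the issue description with a lookup that returns only the single closest key. It also sketches the proposed early-out as a plain range check. The class name, method names, and the use of String keys with an empty end key meaning "no upper bound" are all assumptions made for illustration, and deletes are ignored entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: hypothetical names, not HBase classes or methods.
public class ClosestSketch {

    // Behaviour described in the issue: every key at or before the
    // target is kept as a candidate, so the result grows with the file.
    public static List<Integer> allCandidates(TreeSet<Integer> storeFile, int target) {
        return new ArrayList<>(storeFile.headSet(target, true));
    }

    // Focused lookup: only the single greatest key at or before the
    // target. Returns null if no such key exists. Deletes are ignored.
    public static Integer closestAtOrBefore(TreeSet<Integer> storeFile, int target) {
        return storeFile.floor(target);
    }

    // Early-out check: does the row fall inside [startKey, endKey)?
    // An empty endKey is assumed here to mean "no upper bound".
    public static boolean regionContains(String startKey, String endKey, String row) {
        boolean atOrAfterStart = row.compareTo(startKey) >= 0;
        boolean beforeEnd = endKey.isEmpty() || row.compareTo(endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    public static void main(String[] args) {
        TreeSet<Integer> storeFile = new TreeSet<>();
        for (int k = 0; k <= 10; k++) {
            storeFile.add(k);
        }
        // Candidate list versus the single closest key for target 7.
        System.out.println(allCandidates(storeFile, 7));
        System.out.println(closestAtOrBefore(storeFile, 7));
        // Range check for a row inside a hypothetical region.
        System.out.println(regionContains("bbb", "mmm", "ccc"));
    }
}
```

With keys 0-10 and a target of 7, the first method hands back eight candidates per store file while the second hands back one, which is the scaling difference the issue describes. The range check is only the final comparison; how the region's start and end keys are obtained from HRegionInfo is not shown.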