Date: Tue, 13 Dec 2011 22:37:31 +0000 (UTC)
From: "jiraposter@reviews.apache.org (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1747222875.7866.1323815851986.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

[
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168799#comment-13168799 ]

jiraposter@reviews.apache.org commented on HBASE-2600:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3186/#review3887
-----------------------------------------------------------

I thought the change would be bigger than this. Does it work?

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Should this be 'starts with'? Are 0x01 and 0x02 good characters to have here? They are unprintable. Would it be better to have printables? More friendly to, you know, those humans that have to look at this stuff.

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
What's this define for?

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Why this stray ';'?

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
This define name is hard to grok too.

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
What's up with your formatting here? Here and a few lines down for the @return?

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
A bit of a comment here on why this math would help the reader.

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Can you not just put DELIMITER here? Ditto for the puts above? Do you have to put it into oneByte first?

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Would it be an error if the id is null?

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Method names do not begin with capital letters. Line is too long.

src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
Formatting is off in this method?

src/main/java/org/apache/hadoop/hbase/KeyValue.java
Will this always give the same name? Doesn't uuid have time and machine name inputs? Does this belong in here anyway?

src/main/java/org/apache/hadoop/hbase/KeyValue.java
line too long

src/main/java/org/apache/hadoop/hbase/KeyValue.java
Would suggest you not change the formatting already in place; blend in instead (lines too long anyway)

src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java
Needs class comment. What is this class replacing?

src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java
public methods need javadoc?

src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java
Do we need zeros? Is this tablename its uuid? Maybe we can't do uuid if it has host and time factors? Maybe need to sha1/md5 it? Something that will always give us same answer regardless of when we hash or where?

- Michael


On 2011-12-13 21:12:33, Alex Newman wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/3186/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-12-13 21:12:33)
bq.
bq.
bq. Review request for hbase.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. PART 1 of hbase-4616
bq.
bq. This is an idea that Ryan and I have been kicking around on and off for a while now.
bq.
bq. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).
bq.
bq. If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
bq.
bq. This issue is about changing the way we name regions.
bq.
bq. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).
bq.
bq. Converting to the new method, we'd have to run a migration on startup changing the content in meta.
bq.
bq. Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
bq.
bq.
bq. This addresses bug HBASE-2600.
bq. https://issues.apache.org/jira/browse/HBASE-2600
bq.
bq.
bq. Diffs
bq. -----
bq.
bq.   src/main/java/org/apache/hadoop/hbase/HConstants.java 1cf58a9
bq.   src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
bq.   src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
bq.   src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
bq.   src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130
bq.   src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6f19d21
bq.   src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55
bq.   src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION
bq.   src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
bq.   src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
bq.   src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6fca020
bq.   src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
bq.   src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 95712dd
bq.   src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
bq.   src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
bq.   src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0
bq.   src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd
bq.   src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5
bq.
bq. Diff: https://reviews.apache.org/r/3186/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq.
bq. Thanks,
bq.
bq. Alex
bq.
bq.


> Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2600
>                 URL: https://issues.apache.org/jira/browse/HBASE-2600
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: stack
>            Assignee: Alex Newman
>         Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch
>
>
> This is an idea that Ryan and I have been kicking around on and off for a while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
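
The lookup difference described in the issue can be sketched with an ordinary sorted map standing in for the meta table. This is only an illustration, not the patch under review: the class name, the region names, and the LAST sentinel for the final region's open end row are all hypothetical, and real meta row keys also carry the table name and an id. It assumes end rows are exclusive, so a row equal to a region's end row belongs to the next region.

```java
import java.util.Map;
import java.util.TreeMap;

public class EndRowLookupSketch {
    // Hypothetical sentinel for the last region, whose end row is unbounded.
    static final String LAST = "\uffff";

    // Meta keyed by START row: the containing region is the closest key at or
    // before the wanted row, which is a backward-looking lookup.
    static String findByStartKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // Meta keyed by END row (exclusive): the containing region is the first key
    // strictly after the wanted row, which is the first hit of a forward scan.
    static String findByEndKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    // Three illustrative regions: ["", "g"), ["g", "p"), ["p", end of table).
    static TreeMap<String, String> sampleByStart() {
        TreeMap<String, String> m = new TreeMap<>();
        m.put("", "region-1");
        m.put("g", "region-2");
        m.put("p", "region-3");
        return m;
    }

    static TreeMap<String, String> sampleByEnd() {
        TreeMap<String, String> m = new TreeMap<>();
        m.put("g", "region-1");
        m.put("p", "region-2");
        m.put(LAST, "region-3");
        return m;
    }

    public static void main(String[] args) {
        TreeMap<String, String> byStart = sampleByStart();
        TreeMap<String, String> byEnd = sampleByEnd();
        for (String row : new String[] {"a", "g", "m", "z"}) {
            // Both layouts should name the same region for every row.
            System.out.println(row + " -> " + findByStartKey(byStart, row)
                + " / " + findByEndKey(byEnd, row));
        }
    }
}
```

In the sorted map both lookups cost the same. The point of the sketch is only the direction of the search: the end-row layout needs "first key after the row", which maps onto a forward scan, while the start-row layout needs "last key at or before the row", which is what getClosestRowBefore has to emulate by walking backward.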