hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
Date Sun, 25 Mar 2012 20:13:31 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237965#comment-13237965 ]

jiraposter@reviews.apache.org commented on HBASE-2600:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3466/
-----------------------------------------------------------

(Updated 2012-03-25 20:11:32.746962)


Review request for hbase, Michael Stack and Lars Hofhansl.


Summary (updated)
-------

This is an idea that Ryan and I have been kicking around on and off for a while now.

If region names were made of tablename+endrow instead of tablename+startrow, then to find the
region in the meta tables that contains a wanted row, we'd just have to open a scanner at the
passed row, and the first row returned by the scan would be that of the region we need (if it
is an offlined parent, we'd have to scan on to the next row).
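
To make the difference concrete, here is a minimal sketch that models meta as a sorted map
rather than a real HBase table. The class name, the LAST sentinel standing in for the empty
end key of a table's last region, and the single-letter region names are all assumptions for
illustration; they are not taken from the patch. Because a region's end key is exclusive, the
end-key lookup takes the first key strictly greater than the row.

```java
import java.util.Map;
import java.util.TreeMap;

public class EndKeyLookupSketch {
    // Hypothetical sentinel for the empty end key of the last region,
    // chosen so that it sorts after every ordinary row key in this sketch.
    static final String LAST = "\uffff";

    // Meta keyed by region START key: the containing region is the one with
    // the greatest start key <= row, i.e. a backward-looking lookup.
    static String findByStartKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // Meta keyed by region END key (exclusive): the containing region is the
    // first entry whose key is strictly greater than row, i.e. a forward scan.
    static String findByEndKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions: A = ["", "g"), B = ["g", "p"), C = ["p", "")
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "A");
        byStart.put("g", "B");
        byStart.put("p", "C");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "A");
        byEnd.put("p", "B");
        byEnd.put(LAST, "C");

        for (String row : new String[] {"a", "g", "h", "z"}) {
            System.out.println(row + " -> " + findByStartKey(byStart, row)
                + " / " + findByEndKey(byEnd, row));
        }
    }
}
```

Both lookups return the same region for every row; only the direction of the search changes.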

If we redid the meta tables in this format, we'd be using an access pattern that is natural to
HBase: a scan, as opposed to the perverse, expensive getClosestRowBefore we currently have, which
has to walk backward in meta to find the containing region.

This issue is about changing the way we name regions.

If we were using scans, prewarming the client cache would be nearly costless (as opposed to what
we currently have to do, which is first a getClosestRowBefore and then a scan forward from the
closest row before).

Converting to the new method, we'd have to run a migration on startup changing the content
in meta.
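
As a rough illustration of what such a migration has to do, the sketch below re-keys an
in-memory map from start key to end key. It is not the MetaMigratev2 code in this review: the
class name, the map-based stand-in for meta, and the LAST sentinel for the empty end key of the
last region are assumptions made for the example.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaRekeySketch {
    // Hypothetical sentinel for the empty end key of a table's last region.
    static final String LAST = "\uffff";

    // Input:  start key -> end key (an empty end key marks the last region).
    // Output: end key -> start key, so a forward search by row finds the region.
    static TreeMap<String, String> rekeyByEndKey(TreeMap<String, String> byStart) {
        TreeMap<String, String> byEnd = new TreeMap<>();
        for (Map.Entry<String, String> e : byStart.entrySet()) {
            String end = e.getValue().isEmpty() ? LAST : e.getValue();
            byEnd.put(end, e.getKey());
        }
        return byEnd;
    }

    public static void main(String[] args) {
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "g");
        byStart.put("g", "p");
        byStart.put("p", "");
        // Print how many regions survived the re-keying.
        System.out.println(rekeyByEndKey(byStart).size());
    }
}
```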

Up to now, the randomid component of a region name has been the timestamp of region creation.
HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes"
proposes changing the randomid so that it contains the actual name of the directory in the
filesystem that hosts the region. If we had this in place, I think it would help with the
migration to this new way of doing the meta: as is, the region name in the fs is a hash of the
regionname, so changing the format of the regionname would mean we generate a different hash.
We'd therefore need hbase-2531 to be in place before we could do this change.
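
The dependency can be seen in a small sketch. It assumes the directory name is derived by
hashing the full region name; MD5 is used here only as a stand-in hash, and the class name,
table name, keys, and id are hypothetical values chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RegionDirNameSketch {
    // Derive a directory name by hashing the whole region name.
    // MD5 is a stand-in here, not necessarily the hash HBase uses.
    static String encode(String regionName) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(regionName.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same region ["g", "p") of a hypothetical table, named two ways.
        String byStartKey = "TestTable,g,1332706411000";
        String byEndKey = "TestTable,p,1332706411000";
        System.out.println(encode(byStartKey));
        System.out.println(encode(byEndKey));
    }
}
```

The two names hash to different values, so a directory laid out under the old name would not be
found under the new one unless the directory name is carried in the region name itself.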


The Thrift method public TRegionInfo getRegionInfo(ByteBuffer searchRow) throws IOError was
nulled out and is enabled again by https://reviews.apache.org/r/3514/. The two reviews are
listed as dependencies in the jira and will be committed together.


This addresses bug HBASE-2600.
    https://issues.apache.org/jira/browse/HBASE-2600


Diffs (updated)
-----

  security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java c1f20de
  src/main/java/org/apache/hadoop/hbase/HConstants.java 8888347 
  src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 8d83ff3 
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java fc5e53e 
  src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f 
  src/main/java/org/apache/hadoop/hbase/catalog/MetaMigratev2.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java 0129ee9 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 16e4017 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b2a5463 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java 8e7d7f7 
  src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 04150ad 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 47381f4 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f404999 
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 197eb71 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18c13c4 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 30c61ca 
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 757f98e 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java dbc9251 
  src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 2ec6677
  src/main/java/org/apache/hadoop/hbase/migration/HRegionInfo090x2.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 8174cf5
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 02d55d4 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e0af8fb 
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 0592f40 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 0c7b396 
  src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 56e31e1 
  src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 3535595 
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java 60eb426 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/AlreadyExists.java a5b81f5 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/BatchMutation.java d5df940 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/ColumnDescriptor.java 4ce85e7 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 6c505c0 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/IOError.java 11e31e3 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/IllegalArgument.java ede215f 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/Mutation.java ef1817f 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/TCell.java 6ee8ca7 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/TRegionInfo.java ed251e8 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/TRowResult.java e1709b5 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/TScan.java f7cc05d 
  src/main/java/org/apache/hadoop/hbase/util/FSUtils.java aebe5b0 
  src/main/java/org/apache/hadoop/hbase/util/Writables.java 3d20723 
  src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift f698a6c 
  src/test/data/generate-hbase-2600-root-in-tmp.sh PRE-CREATION 
  src/test/data/hbase-2600-root.dir.tgz PRE-CREATION 
  src/test/data/hbase-4388-root.dir.tgz da2244e8097d3fd3b0cb04d49cbc615406f7e809 
  src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902 
  src/test/java/org/apache/hadoop/hbase/catalog/TestMetaUpdate.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java f7430ee 
  src/test/java/org/apache/hadoop/hbase/client/TestMetaMigrationRemovingHTD.java d1c15af 
  src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936 
  src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java d2b3060 
  src/test/java/org/apache/hadoop/hbase/migration/TestMigration.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/migration/TestMigrationFrom090To092.java c3651ac 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6dfba41 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab 
  src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6 
  src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5 

Diff: https://reviews.apache.org/r/3466/diff


Testing
-------

Unit tests started table. 


Tests in error: 
  org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched
for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,,
stopRow lastChar's int value: 35 with parentTable:.META.

I need to know how to update/recreate the tar ball which is the source for that test.


Thanks,

Alex


                
> Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2600
>                 URL: https://issues.apache.org/jira/browse/HBASE-2600
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Alex Newman
>         Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch,
0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch,
0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch,
0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1,
0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch,
0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 2600-trunk-01-17.txt, jenkins.pdf
>
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
