hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2531) 32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes
Date Tue, 18 May 2010 05:10:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868534#action_12868534 ]

stack commented on HBASE-2531:

@Kannan As I see it, we just need to make sure the system can work with both styles of
naming: it must read in the old stuff without issue and, at the same time, use the UUID
form when writing the directory name of any new region.  Somehow, we also need to drop
this notion of encoding the region name.  Going forward it will not be needed since the
UUID will actually be part of the region name.
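
To make the "both styles" idea concrete, here is a rough sketch, not the actual HBase
change.  The class and method names (RegionDirNames, newRegionDirName, isLegacyName,
isUuidName) are made up for illustration.  It assumes old region directories are named
with the decimal int that the current encoding produces, and that new ones would be named
with the string form of a random java.util.UUID.

```java
import java.util.UUID;
import java.util.regex.Pattern;

/** Hypothetical helper; none of these names exist in HBase. */
public class RegionDirNames {
  // Assumption: a legacy directory name is the decimal int from the old
  // hash-based encoding. The optional '-' covers Integer.MIN_VALUE, which
  // Math.abs() returns unchanged.
  private static final Pattern LEGACY = Pattern.compile("-?\\d{1,10}");

  /** Name for a newly created region directory: a random UUID string. */
  public static String newRegionDirName() {
    return UUID.randomUUID().toString();
  }

  /** True if the name looks like an old-style int-encoded directory name. */
  public static boolean isLegacyName(String dirName) {
    return LEGACY.matcher(dirName).matches();
  }

  /** True if the name is a UUID in canonical lowercase form. */
  public static boolean isUuidName(String dirName) {
    try {
      // Round-trip so that only the canonical string form is accepted.
      return UUID.fromString(dirName).toString().equals(dirName);
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String fresh = newRegionDirName();
    System.out.println(fresh + " uuid? " + isUuidName(fresh));
    System.out.println("1273541236 legacy? " + isLegacyName("1273541236"));
  }
}
```

A reader of the filesystem layout could then accept a directory if either check passes,
while anything that creates a region only ever calls newRegionDirName().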

I agree that getting zk in the mix, or even hdfs for that matter, for making region
directory names complicates something that could be real simple if we use UUIDs.
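
For a feel of why UUIDs let us skip any coordination, here is a back-of-the-envelope
sketch using the standard birthday approximation, P ~= 1 - exp(-n(n-1) / (2 * 2^bits)).
The class name CollisionEstimate is made up.  The 31-bit figure assumes the old encoding
yields roughly 2^31 distinct non-negative ints and that the hash behaves uniformly; the
122-bit figure is the number of random bits in a version 4 UUID.

```java
/** Hypothetical estimate only; assumes names are uniformly distributed. */
public class CollisionEstimate {
  /**
   * Birthday approximation of the chance that at least two of n names
   * collide in a space of 2^bits equally likely values.
   */
  public static double collisionProbability(long n, int bits) {
    double space = Math.pow(2.0, bits);
    double exponent = -((double) n * (n - 1)) / (2.0 * space);
    // 1 - exp(x), computed via expm1 to stay accurate for tiny probabilities.
    return -Math.expm1(exponent);
  }

  public static void main(String[] args) {
    long[] regionCounts = {1_000L, 10_000L, 100_000L};
    for (long n : regionCounts) {
      System.out.printf("n=%d  31-bit=%.3g  122-bit=%.3g%n",
          n, collisionProbability(n, 31), collisionProbability(n, 122));
    }
  }
}
```

Under those assumptions the 31-bit space gets crowded at region counts a large cluster
can plausibly reach, while the UUID column stays negligible, so nothing external has to
hand out or check names.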

> 32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes
> ----------------------------------------------------------------------------
>                 Key: HBASE-2531
>                 URL: https://issues.apache.org/jira/browse/HBASE-2531
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.21.0
> Kannan tripped over two regionnames that hashed the same:
> Here is code demo'ing that his two names hash the same:
> {code}
> package org;
> import org.apache.hadoop.hbase.util.Bytes;
> import org.apache.hadoop.hbase.util.JenkinsHash;
> public class Testing {
>   public static void main(final String [] args) {
>     System.out.println(encodeRegionName(Bytes.toBytes("test1,6838000000,1273541236167")));
>     System.out.println(encodeRegionName(Bytes.toBytes("test1,0520100000,1273541610201")));
>   }
>   /**
>    * @param regionName
>    * @return the encodedName
>    */
>   public static int encodeRegionName(final byte [] regionName) {
>     return Math.abs(JenkinsHash.getInstance().hash(regionName, regionName.length, 0));
>   }
> }
> {code}
> Need new encoding mechanism.  Will need to migrate old regions to new schema.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
