hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2531) 32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes
Date Sun, 30 May 2010 00:11:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873347#action_12873347 ]

HBase Review Board commented on HBASE-2531:
-------------------------------------------

Message from: "Kannan Muthukkaruppan" <kannan@facebook.com>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/104/#review102
-----------------------------------------------------------



trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
<http://review.hbase.org/r/104/#comment601>

    will do.



trunk/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
<http://review.hbase.org/r/104/#comment602>

    Old-style region names don't have their encoded name in the regionNameStr. So I check
for that here and append the encoded name for those regions, so that the logs, web UI, etc.
will display the encoded name for those regions as well.
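
    A rough sketch of the check described above, for illustration only. It assumes
(hypothetically) that a new-style name ends in ".<32 hex chars>." while an old-style name
does not. The class, the helper names and the suffix format are assumptions made for this
example and are not taken from the patch; the digest is computed with the JDK's
MessageDigest rather than the HBase MD5Hash utility.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RegionNameDisplay {

    /** Hex-encoded MD5 digest of the given string (JDK only, no HBase classes). */
    static String md5Hex(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every conforming JDK ships MD5, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Assumed convention: new-style names end with ".<32 lowercase hex chars>." */
    static boolean hasEncodedSuffix(String regionName) {
        return regionName.matches(".*\\.[0-9a-f]{32}\\.");
    }

    /** Returns the name to display, appending the encoded name if it is missing. */
    static String displayName(String regionName, String encodedName) {
        if (hasEncodedSuffix(regionName)) {
            return regionName;
        }
        return regionName + "." + encodedName + ".";
    }

    public static void main(String[] args) {
        String oldStyle = "test1,6838000000,1273541236167";
        // "12345" is a placeholder for an old-style (integer) encoded name.
        System.out.println(displayName(oldStyle, "12345"));

        String newStyle = oldStyle + "." + md5Hex(oldStyle) + ".";
        System.out.println(displayName(newStyle, "ignored"));
    }
}
```

    With this shape, a name that already carries its suffix is returned unchanged, so
calling displayName repeatedly does not keep appending.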



trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
<http://review.hbase.org/r/104/#comment603>

    will get rid of all whitespace diffs.



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<http://review.hbase.org/r/104/#comment604>

    Will get rid of all the whitespace diffs. After the big whitespace cleanup landed in 0.20,
I set my Eclipse to strip trailing whitespace, but it looks like trunk still has a fair
amount of it.



trunk/src/main/java/org/apache/hadoop/hbase/util/MD5Hash.java
<http://review.hbase.org/r/104/#comment605>

    ok...


- Kannan





> 32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-2531
>                 URL: https://issues.apache.org/jira/browse/HBASE-2531
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Kannan Muthukkaruppan
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2531_v2.patch
>
>
> Kannan tripped over two regionnames that hashed the same.
> Here is code demonstrating that his two names hash to the same value:
> {code}
> package org;
> import org.apache.hadoop.hbase.util.Bytes;
> import org.apache.hadoop.hbase.util.JenkinsHash;
> public class Testing {
>   public static void main(final String [] args) {
>     System.out.println(encodeRegionName(Bytes.toBytes("test1,6838000000,1273541236167")));
>     System.out.println(encodeRegionName(Bytes.toBytes("test1,0520100000,1273541610201")));
>   }
>   /**
>    * @param regionName
>    * @return the encodedName
>    */
>   public static int encodeRegionName(final byte [] regionName) {
>     return Math.abs(JenkinsHash.getInstance().hash(regionName, regionName.length, 0));
>   }
> }
> {code}
> Need new encoding mechanism.  Will need to migrate old regions to new schema.
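
To put a rough number on "too susceptible", the sketch below applies the standard birthday
approximation p ≈ 1 - exp(-n(n-1) / (2N)) for n names hashed uniformly into N values. It
assumes an ideal uniform hash, which the Jenkins hash only approximates, and treats the
Math.abs'd 32-bit result as roughly 2^31 distinct values. The class and method names are
made up for this example, and 2^128 stands in for a 128-bit digest purely as a comparison
point.

```java
public class CollisionOdds {

    /**
     * Birthday-bound approximation of the probability that at least two of
     * n uniformly hashed names collide in a space of the given size.
     */
    static double collisionProbability(long n, double space) {
        double exponent = (double) n * (n - 1) / (2.0 * space);
        // 1 - exp(-x), computed via expm1 to stay accurate for tiny x.
        return -Math.expm1(-exponent);
    }

    public static void main(String[] args) {
        double space31 = Math.pow(2, 31);   // non-negative 32-bit ints
        double space128 = Math.pow(2, 128); // a 128-bit digest, for comparison
        for (long n : new long[] {1_000L, 10_000L, 100_000L}) {
            System.out.printf("%d regions: 31-bit p=%.4f, 128-bit p=%.3e%n",
                n,
                collisionProbability(n, space31),
                collisionProbability(n, space128));
        }
    }
}
```

Under these assumptions, a clash among region names becomes likely well before a cluster
reaches a hundred thousand regions, while a 128-bit space keeps the same estimate
negligibly small.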

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

