hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
Date Mon, 17 Dec 2012 19:22:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534191#comment-13534191 ]

stack commented on HBASE-7342:

We should not commit the test for this patch.  Spinning up a cluster to check a plain
array math problem is over the top (but thanks for making the test, Aleksandr ...).
> Split operation without split key incorrectly finds the middle key in off-by-one error
> --------------------------------------------------------------------------------------
>                 Key: HBASE-7342
>                 URL: https://issues.apache.org/jira/browse/HBASE-7342
>             Project: HBase
>          Issue Type: Bug
>          Components: HFile, io
>    Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
>            Reporter: Aleksandr Shulman
>            Assignee: Aleksandr Shulman
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.4
>         Attachments: 7342-0.94.txt, 7342-trunk-v3.txt, HBASE-7342-v1.patch, HBASE-7342-v2.patch
> I took a deeper look into issues I was having with region splitting when specifying
> a region (but not a key for splitting).
> The midkey calculation is off by one and, when there are 2 rows, will pick the 0th one.
> This causes the firstkey to be the same as the midkey, and the split will fail. Removing
> the -1 causes it to work correctly, as per the test I've added.
> Looking into the code here is what goes on:
> 1. Split takes the largest storefile
> 2. It puts all the keys into a 2-dimensional array called blockKeys[][]. Key i resides
> at blockKeys[i]
> 3. Getting the middle root-level index should yield the key in the middle of the storefile
> 4. In step 3, we see that there is a possibly erroneous (-1) to adjust for the 0-offset
> 5. In a case where there are only 2 blockKeys, this yields the 0th block key.
> 6. Unfortunately, this is the same block key that 'firstKey' will be.
> 7. This yields the result in HStore.java:1873 ("cannot split because midkey is the same
> as first or last row")
> 8. Removing the -1 solves the problem (in this case). 
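
The arithmetic in steps 3 through 8 can be checked in isolation, with no cluster involved. The sketch below is an illustration under assumptions, not the HBase code: the class and method names are hypothetical, the keys are plain strings rather than byte arrays, and the index expression containing the -1 is assumed to have the form (count - 1) / 2, since the report only says that a -1 is present. Only the comparison against the first key is modeled; the "last row" half of the HStore check is left out.

```java
public class MidKeySketch {

    // ASSUMPTION: the reported expression has the form (count - 1) / 2.
    // Java integer division truncates, so two keys give index 0.
    static int midIndexWithMinusOne(int keyCount) {
        return (keyCount - 1) / 2;
    }

    // The same expression with the -1 removed, as the report proposes.
    static int midIndexWithoutMinusOne(int keyCount) {
        return keyCount / 2;
    }

    // Hypothetical stand-in for the "midkey is the same as first row" condition.
    static boolean midEqualsFirst(String[] blockKeys, int midIndex) {
        return blockKeys[midIndex].equals(blockKeys[0]);
    }

    public static void main(String[] args) {
        String[] blockKeys = {"row1", "row2"}; // the two-key case from the report

        int withMinusOne = midIndexWithMinusOne(blockKeys.length);
        int withoutMinusOne = midIndexWithoutMinusOne(blockKeys.length);

        System.out.println("with -1:    index " + withMinusOne
                + ", same as first key: " + midEqualsFirst(blockKeys, withMinusOne));
        System.out.println("without -1: index " + withoutMinusOne
                + ", same as first key: " + midEqualsFirst(blockKeys, withoutMinusOne));
    }
}
```

With two keys, the first expression selects index 0, which is the first key itself, while the second selects index 1. This matches the failure described in steps 5 and 6 and the fix in step 8.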

