hbase-issues mailing list archives

From "Esteban Gutierrez (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18987) Raise value of HConstants#MAX_ROW_LENGTH
Date Thu, 12 Oct 2017 14:49:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202042#comment-16202042 ]

Esteban Gutierrez commented on HBASE-18987:

bq. Not in testOversizedRegionNameForPut 
[~mdrob]: thanks!

bq. The value length can be up to Integer.MAX_VALUE - 1 as we use 4 bytes to store that. But
for the row length it is 2 bytes, right? Then is allowing Integer.MAX_VALUE - 1 for RK length
also correct?

[~anoopsamjohn]: yeah, you are right. The problem seems to run deeper: the KeyValue constructor
accepts an integer for rlength, but there are a few more places where we only use a short: {{createEmptyByteArray}}
checks whether rlength is greater than {{Short.MAX_VALUE}}, and {{rowLen}} in {{KeyOnlyKeyValue}}
is a short. Also, {{KEYVALUE_INFRASTRUCTURE_SIZE}} depends on {{ROW_LENGTH_SIZE}}, which is {{Bytes.SIZEOF_SHORT}}.
My test didn't catch that, since you need to go all the way to serializing the KV.
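
To illustrate why the problem only shows up at serialization time, here is a minimal sketch. It is not HBase code: {{RowLengthFieldSketch}}, {{roundTripRowLength}} and {{checkRowLength}} are hypothetical names, and the only thing taken from the discussion above is the assumption that the row length is stored in a 2-byte field.

```java
import java.nio.ByteBuffer;

// Hypothetical, simplified stand-in for a serialized key prefix; NOT HBase's KeyValue.
class RowLengthFieldSketch {
    // Assumption from the comment above: the row length field is a short (2 bytes).
    static final int ROW_LENGTH_SIZE = Short.BYTES;

    // Writes only the row-length prefix and reads it back as a signed short.
    static int roundTripRowLength(int rlength) {
        ByteBuffer buf = ByteBuffer.allocate(ROW_LENGTH_SIZE);
        buf.putShort((short) rlength); // narrowing cast silently drops the high 16 bits
        buf.flip();
        return buf.getShort();
    }

    // A guard in the spirit of the Short.MAX_VALUE check mentioned above.
    static void checkRowLength(int rlength) {
        if (rlength < 0 || rlength > Short.MAX_VALUE) {
            throw new IllegalArgumentException(
                "Row length " + rlength + " does not fit in " + ROW_LENGTH_SIZE + " bytes");
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripRowLength(Short.MAX_VALUE));     // prints 32767
        System.out.println(roundTripRowLength(Short.MAX_VALUE + 1)); // prints -32768
    }
}
```

An int-typed constructor argument can carry the larger value without complaint; the length is only corrupted once it is narrowed into the 2-byte field, which is why a test that never serializes the KV would pass.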

I think I'm -1 on this now, and the truncate approach might be the only alternative for the time being.

> Raise value of HConstants#MAX_ROW_LENGTH
> ----------------------------------------
>                 Key: HBASE-18987
>                 URL: https://issues.apache.org/jira/browse/HBASE-18987
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 1.0.0, 2.0.0
>            Reporter: Esteban Gutierrez
>            Assignee: Esteban Gutierrez
>            Priority: Minor
>         Attachments: HBASE-18987.master.001.patch, HBASE-18987.master.002.patch
> Short.MAX_VALUE hasn't been a problem for a long time, but one of our customers ran into
an edge case where the midKey used for the split point was very close to Short.MAX_VALUE.
When the split is submitted, we attempt to create the two new daughter regions and name
them via {{HRegionInfo.createRegionName()}} so they can be added to META. Unfortunately,
since {{HRegionInfo.createRegionName()}} uses the midKey as the startKey, the {{Put}} fails because
the resulting row key is too long to pass checkRow, which in turn causes the split to fail.
> I tried a couple of alternatives to address this problem, e.g. truncating the startKey,
but the number of code changes required isn't justified for this edge case. Since we already
use {{Integer.MAX_VALUE - 1}} for {{HConstants#MAXIMUM_VALUE_LENGTH}}, it should be OK to use
the same limit for the maximum row key length.
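
The failure mode described above can be sketched as follows. This is an illustration only, under stated assumptions: {{RegionNameLengthSketch}}, {{regionName}} and {{fitsRowLimit}} are hypothetical helpers, the comma-joined "table,startKey,regionId" layout is a simplification of what {{HRegionInfo.createRegionName()}} produces, and the limit is taken to be Short.MAX_VALUE as in the description.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; NOT the HBase implementation.
class RegionNameLengthSketch {
    // Assumed row-key limit, per the issue description.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    // Assumed, simplified layout: <table>,<startKey>,<regionId>
    static byte[] regionName(byte[] table, byte[] startKey, long regionId) {
        byte[] id = Long.toString(regionId).getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[table.length + 1 + startKey.length + 1 + id.length];
        int pos = 0;
        System.arraycopy(table, 0, out, pos, table.length);
        pos += table.length;
        out[pos++] = ',';
        System.arraycopy(startKey, 0, out, pos, startKey.length);
        pos += startKey.length;
        out[pos++] = ',';
        System.arraycopy(id, 0, out, pos, id.length);
        return out;
    }

    // Stand-in for the length part of a checkRow-style validation.
    static boolean fitsRowLimit(byte[] row) {
        return row.length <= MAX_ROW_LENGTH;
    }

    public static void main(String[] args) {
        byte[] table = "t1".getBytes(StandardCharsets.UTF_8);
        byte[] midKey = new byte[Short.MAX_VALUE - 7]; // a legal row key, close to the limit
        byte[] name = regionName(table, midKey, 1507819740000L);

        System.out.println(fitsRowLimit(midKey)); // prints true
        System.out.println(fitsRowLimit(name));   // prints false
    }
}
```

The midKey itself is a valid row key, but the region name built from it adds the table name, delimiters and region id, so the row written to META ends up over the limit.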

This message was sent by Atlassian JIRA
