hbase-issues mailing list archives

From "Esteban Gutierrez (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-17755) CellBasedKeyBlockIndexReader#midkey should exhaust search of the target middle key on skewed regions
Date Wed, 08 Mar 2017 00:04:38 GMT
Esteban Gutierrez created HBASE-17755:
-----------------------------------------

             Summary: CellBasedKeyBlockIndexReader#midkey should exhaust search of the target
middle key on skewed regions
                 Key: HBASE-17755
                 URL: https://issues.apache.org/jira/browse/HBASE-17755
             Project: HBase
          Issue Type: Bug
          Components: HFile
            Reporter: Esteban Gutierrez
            Assignee: Esteban Gutierrez


We have always returned the middle key of the block index regardless of how the data is
distributed in an HFile. A side effect of that approach is that when millions of cells share
the same row key, it is quite easy to end up with a middle key that is equal to the start key
or to the end key. That makes the HFile nearly impossible to split until enough data is
written into the region for the middle key to shift to another row, or until an operator
splits the region at a custom split point.

Instead, we should exhaust the search for the middle key in the block index so that an HFile
can be split earlier when possible, even if our edge case is to serve a region that could
hold a single key with millions of versions of a row or with millions of qualifiers on the
same row.
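
To illustrate the idea, here is a minimal sketch of an exhaustive midkey search. It is not the
actual CellBasedKeyBlockIndexReader code or the eventual patch. It assumes the block index can
be modeled as a sorted list holding the first row of each block, and that a usable split point
is any row that differs from both the first and the last row of the file. The class name
MidKeySketch and the method findSplittableMidKey are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: rows are plain Strings, not HBase Cells.
final class MidKeySketch {

    // Starts at the middle block index entry and searches outward in both
    // directions until it finds a row that differs from the first and last
    // row of the file. Returns empty if every entry matches one of them.
    static Optional<String> findSplittableMidKey(
            List<String> blockFirstRows, String firstRow, String lastRow) {
        int n = blockFirstRows.size();
        if (n == 0) {
            return Optional.empty();
        }
        int mid = n / 2;
        for (int offset = 0; offset < n; offset++) {
            int right = mid + offset;
            if (right < n && isUsable(blockFirstRows.get(right), firstRow, lastRow)) {
                return Optional.of(blockFirstRows.get(right));
            }
            int left = mid - offset;
            if (left >= 0 && isUsable(blockFirstRows.get(left), firstRow, lastRow)) {
                return Optional.of(blockFirstRows.get(left));
            }
        }
        return Optional.empty();
    }

    // Assumption: a row equal to the start or end row cannot be a split point.
    private static boolean isUsable(String row, String firstRow, String lastRow) {
        return !row.equals(firstRow) && !row.equals(lastRow);
    }

    public static void main(String[] args) {
        // Skewed file: the plain middle entry is "a", which equals the start row.
        List<String> skewed = Arrays.asList("a", "a", "a", "a", "a", "b", "c");
        System.out.println(findSplittableMidKey(skewed, "a", "c")); // Optional[b]

        // Single-row file: no usable split point exists.
        List<String> single = Arrays.asList("a", "a", "a");
        System.out.println(findSplittableMidKey(single, "a", "a")); // Optional.empty
    }
}
```

In the skewed example the plain middle entry would be "a", the same as the start row, so the
file could not be split. Searching outward finds "b" instead. A file that truly holds a single
row still yields no split point, which matches the edge case described above.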





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
