Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Wed, 26 Nov 2014 09:53:12 +0000 (UTC)
From: "rajeshbabu (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12757912.1416985172000.25045.1416995592941@Atlassian.JIRA>
In-Reply-To: <JIRA.12757912.1416985172000@Atlassian.JIRA>
References: <JIRA.12757912.1416985172000@Atlassian.JIRA>
 <JIRA.12757912.1416985172858@arcas>
Subject: [jira] [Commented] (HBASE-12583) Allow creating reference files
 even the split row not lies in the storefile range if required
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-12583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225965#comment-14225965 ] 

rajeshbabu commented on HBASE-12583:
------------------------------------

bq.So, if a split key is bigger or smaller than storefile, we don't want to split the storefile; the file goes to the left or right of the split point; a split point that is not in a storefile is fine.
Absolutely correct [~stack].
bq. They are companion regions? 
Yes. Even after the split, the data regions' key ranges and the index regions' key ranges will be the same.
bq. Can't you split them by passing in a pertinent split key, one related to that of the primary region but adapted for the companion region? 
In the index storefile, the data is sorted by column value, with the data row key appended as a suffix to the index row key, so we cannot derive an exact split key for the index from the actual split point.
bq. Are you passing in 'wrong' key, the split key for primary region?
Yes. While rewriting the corresponding half of the storefile to the daughter regions, we parse the data row key (from the index row key) and compare it with the actual split row to decide whether the entry belongs to the left or right side of the split point.
bq. This stuff used to work for you but now the checks are more stringent, it breaks you?
Yes. It used to work in 0.94.
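
To illustrate the left/right decision described above, here is a minimal, self-contained sketch. It is not HBase or Phoenix code: the class and method names are hypothetical, and the index row key layout (indexed column value, a single 0x00 separator, then the data row key) is an assumption made only for this example. It also assumes that a row equal to the split row goes to the top (right) daughter.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, not HBase or Phoenix code.
// ASSUMED index row key layout: <indexed column value> 0x00 <data row key>
class IndexSplitSketch {

    // Returns the data row key that follows the first 0x00 separator (assumed layout).
    static byte[] dataRowKey(byte[] indexRowKey) {
        for (int i = 0; i < indexRowKey.length; i++) {
            if (indexRowKey[i] == 0) {
                return Arrays.copyOfRange(indexRowKey, i + 1, indexRowKey.length);
            }
        }
        throw new IllegalArgumentException("no separator in index row key");
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True when the entry goes to the top (right) daughter, i.e. its data row key
    // is greater than or equal to the split row (assumption: split row is inclusive on top).
    static boolean belongsToTop(byte[] indexRowKey, byte[] splitRow) {
        return compareUnsigned(dataRowKey(indexRowKey), splitRow) >= 0;
    }

    // Builds an index row key in the assumed layout.
    static byte[] indexKey(String value, String row) {
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[v.length + 1 + r.length];
        System.arraycopy(v, 0, out, 0, v.length);
        // out[v.length] is left as 0x00, the assumed separator
        System.arraycopy(r, 0, out, v.length + 1, r.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] splitRow = "row5".getBytes(StandardCharsets.UTF_8);
        // "apple..." sorts before "zebra..." in the index, yet the sides are reversed.
        System.out.println(belongsToTop(indexKey("apple", "row9"), splitRow));
        System.out.println(belongsToTop(indexKey("zebra", "row1"), splitRow));
    }
}
```

Because index entries are ordered by column value, entries for both daughters can be interleaved anywhere in the index storefile, which is why the split row alone cannot serve as a range boundary for it.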

> Allow creating reference files even the split row not lies in the storefile range if required
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-12583
>                 URL: https://issues.apache.org/jira/browse/HBASE-12583
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: rajeshbabu
>            Assignee: rajeshbabu
>              Labels: Phoenix
>             Fix For: 2.0.0, 0.98.9, 0.99.2
>
>
> Currently, HRegionFileSystem#splitStoreFile does not create a reference file if the split row does not lie within the storefile's range, which means one of the child regions gets no data from that storefile.
> {code}
>     // Check whether the split row lies in the range of the store file
>     // If it is outside the range, return directly.
>     if (top) {
>       //check if larger than last key.
>       KeyValue splitKey = KeyValueUtil.createFirstOnRow(splitRow);
>       byte[] lastKey = f.createReader().getLastKey();
>       // If lastKey is null means storefile is empty.
>       if (lastKey == null) return null;
>       if (f.getReader().getComparator().compareFlatKey(splitKey.getBuffer(),
>           splitKey.getKeyOffset(), splitKey.getKeyLength(), lastKey, 0, lastKey.length) > 0) {
>         return null;
>       }
>     } else {
>       //check if smaller than first key
>       KeyValue splitKey = KeyValueUtil.createLastOnRow(splitRow);
>       byte[] firstKey = f.createReader().getFirstKey();
>       // If firstKey is null means storefile is empty.
>       if (firstKey == null) return null;
>       if (f.getReader().getComparator().compareFlatKey(splitKey.getBuffer(),
>           splitKey.getKeyOffset(), splitKey.getKeyLength(), firstKey, 0, firstKey.length) < 0) {
>         return null;
>       }
>     }
> {code}
> In some cases, mainly for secondary index tables, the split row should be compared with only part of the row key (in a composite row key). There we need to create reference files even when the split row does not lie within the storefile's range, so that they can be rewritten to the child regions by a custom half store file reader that compares the relevant part of the row key with the split row.
> The check that compares the split row with the storefile range and returns directly could be skipped when a special boolean attribute in the table descriptor is set to true. Alternatively, we could add coprocessor hooks in which the references are created and the default behavior is bypassed.
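
The range check and the proposed opt-out can be modeled in a few lines. The following is a simplified sketch only: it compares plain row bytes rather than full KeyValue keys, and the class name, method name, and `skipRangeCheck` parameter are hypothetical stand-ins for the table descriptor attribute suggested in the description, not an existing HBase API.

```java
// Hypothetical sketch, not the HBase implementation.
class SplitRangeCheckSketch {

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Returns true when a reference file should be created for the requested daughter.
    // firstRow/lastRow are the first and last rows of the store file (null if it is empty).
    // skipRangeCheck is the HYPOTHETICAL flag standing in for the proposed attribute.
    static boolean shouldCreateReference(byte[] splitRow, byte[] firstRow, byte[] lastRow,
                                         boolean top, boolean skipRangeCheck) {
        if (firstRow == null || lastRow == null) {
            return false; // empty store file: nothing to reference
        }
        if (skipRangeCheck) {
            return true; // caller decides per entry later, e.g. in a custom half reader
        }
        if (top) {
            // top daughter gets nothing if the split row is larger than the last row
            return compareUnsigned(splitRow, lastRow) <= 0;
        }
        // bottom daughter gets nothing if the split row is smaller than the first row
        return compareUnsigned(splitRow, firstRow) >= 0;
    }
}
```

With a store file spanning rows "b" to "d" and a split row of "x", the default check creates no reference for the top daughter, while the hypothetical flag forces one to be created for both daughters.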

