phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "chenglei (JIRA)" <>
Subject [jira] [Created] (PHOENIX-3705) SkipScanFilter may repeatedly copy rowKey Columns to startKey
Date Wed, 01 Mar 2017 10:01:45 GMT
chenglei created PHOENIX-3705:

             Summary: SkipScanFilter may repeatedly copy rowKey Columns to startKey
                 Key: PHOENIX-3705
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.9.0
            Reporter: chenglei

See following simple unit test first,the rowKey is composed of three PInteger columns,and
the slots of SkipScanFilter are:
[[[1 - 4]], [5, 7], [[9 - 10]]],
When SkipScanFilter.filterKeyValue method is invoked on a KeyValue whose rowKey is [2,7,11],
obviously SkipScanFilter.filterKeyValue
returns ReturnCode.SEEK_NEXT_USING_HINT and SkipScanFilter.getNextCellHint returns  [3,5,9],
but unfortunately, SkipScanFilter.getNextCellHint actually returns  [2,8,5,9] , a very strange
value, the unit tests failed.
    public void testNavigate() {
        RowKeySchemaBuilder builder = new RowKeySchemaBuilder(3);
        for(int i=0;i<3;i++) {
                    new PDatum() {

                        public boolean isNullable() {
                            return false;

                        public PDataType getDataType() {
                            return PInteger.INSTANCE;

                        public Integer getMaxLength() {
                            return PInteger.INSTANCE.getMaxLength(null);

                        public Integer getScale() {
                            return PInteger.INSTANCE.getScale(null);

                        public SortOrder getSortOrder() {
                            return SortOrder.getDefault();

                    }, false, SortOrder.getDefault());
        List<List<KeyRange>> rowKeyColumnRangesList=Arrays.asList(      
                    PInteger.INSTANCE.getKeyRange(PInteger.INSTANCE.toBytes(1), true, PInteger.INSTANCE.toBytes(4),
                    PInteger.INSTANCE.getKeyRange(PInteger.INSTANCE.toBytes(9), true, PInteger.INSTANCE.toBytes(10),
        SkipScanFilter skipScanFilter=new SkipScanFilter(rowKeyColumnRangesList,;
        byte[] rowKey=ByteUtil.concat(
        KeyValue keyValue=KeyValue.createFirstOnRow(rowKey);
        ReturnCode returnCode=skipScanFilter.filterKeyValue(keyValue);
        assertTrue(returnCode == ReturnCode.SEEK_NEXT_USING_HINT);
        Cell nextCellHint=skipScanFilter.getNextCellHint(keyValue);

Let us see what's wrong, first column of rowKey [2,7,11 ] is 2, which is in SkipScanFilter's
first slot range [1-4], so position[0] is 0 and we go to the second column 7, which match
the second range [7] of SkipScanFilter's second slot [5, 7],so position[1] is 1 and we go
to the third column 11, which is bigger than the third slot range [9 - 10],so position[2]
is 0 and we begin to backtrack to second column, because the second range [7] of SkipScanFilter's
second slot is singleKey and there is no more range,so position[1] is 0 and we continue to
backtrack to first column, because the first slot range [1-4] is not singleKey so we stop
backtracking at first column.

Now the problem comes, in following line 448 of SkipScanFilter.navigate method,SkipScanFilter.setStartKey
method is invoked,first copy rowKey columns before ptr to SkipScanFilter.startKey, because
now ptr still point to the third column, so copy the first and second columns to SkipScanFilter.startKey,SkipScanFilter.startKey
is [2,7]  after this step , then setStartKey method copy the lower bound SkipScanFilter.slots
from j+1 column, accoring to SkipScanFilter.position array,now j is 0, and both position[1]
and position[2] are 0,so SkipScanFilter.startKey becomes [2,7,5,9], and in following line
457, ByteUtil.nextKey is invoked on [2,7], [2,7] is incremented to [2,8], finally SkipScanFilter.startKey
is [2,8,5,9].

448                       int currentLength = setStartKey(ptr, minOffset, j+1, nSlots, false);
449                    // From here on, we use startKey as our buffer (resetting minOffset
and maxOffset)
450                   // We've copied the part of the current key above that we need into
451                    // Reinitialize the iterator to be positioned at previous slot position
452                    minOffset = 0;
453                    maxOffset = startKeyLength;
454                    schema.iterator(startKey, minOffset, maxOffset, ptr, ScanUtil.getRowKeyPosition(slotSpan,
455                    // Do nextKey after setting the accessor b/c otherwise the null byte
may have
456                    // been incremented causing us not to find it
457                    ByteUtil.nextKey(startKey, currentLength);

>From above analysis, we can see the second column is repeatedly copied to SkipScanFilter.startKey,
copy 7 to SkipScanFilter.startKey is error ,before SkipScanFilter.setStartKey method is invoked
in line 448, we should first move ptr to j+1, that is to say ,following code might be inserted
before line 448:
                            ScanUtil.getRowKeyPosition(slotSpan, i), 
                            ScanUtil.getRowKeyPosition(slotSpan, j+1), 
 Another problem is why the existing SkipScanFilter unit tests are ok? That is because in
all existing unit test cases, backtrack column is just before current column which ptr points
to,  i.e. i==j+1, and this problem may not affect the final query result , but obviously,
it may lead to the erroneous startKey and  cause the Skip Scan inefficient.

This message was sent by Atlassian JIRA

View raw message