hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17958) Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP
Date Tue, 25 Apr 2017 08:39:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982567#comment-15982567 ]

Duo Zhang commented on HBASE-17958:
-----------------------------------

Yeah, this is really bad practice. The filter returns SEEK_NEXT_ROW or SEEK_NEXT_COL, but
we may still pass a cell from the same row or the same column to SQM. SQM has a strange optimization
called stickyNextRow (which really confused me when refactoring SQM), so SEEK_NEXT_ROW
usually works, but there is no such optimization for SEEK_NEXT_COL, so it is broken.

In fact, if we decide that a skip is better than a seek, then we should call heap.next() repeatedly
until we reach the next row or the next column, and only then start calling SQM.match again. It is
really confusing that SQM returns SEEK_NEXT_ROW or SEEK_NEXT_COL but can still receive
a cell from the same row or the same column, right?
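
A rough sketch of that idea, using simplified stand-ins rather than the real HBase classes: SimpleCell, Matcher and scan() are hypothetical names, and a sorted list plays the role of the heap. The loop advances past every cell of the current column (or row) before the matcher is consulted again, so the matcher never sees a cell it has already asked to seek past.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipInsteadOfSeekSketch {
    // Reduced set of match codes; the real ScanQueryMatcher.MatchCode has more values.
    enum MatchCode { INCLUDE, SEEK_NEXT_COL, SEEK_NEXT_ROW }

    // Hypothetical stand-in for a Cell: only the row and the column qualifier.
    static final class SimpleCell {
        final String row;
        final String qualifier;
        SimpleCell(String row, String qualifier) {
            this.row = row;
            this.qualifier = qualifier;
        }
        boolean sameRow(SimpleCell o) { return row.equals(o.row); }
        boolean sameColumn(SimpleCell o) { return sameRow(o) && qualifier.equals(o.qualifier); }
        @Override public String toString() { return row + "/" + qualifier; }
    }

    // Hypothetical stand-in for SQM.match.
    interface Matcher { MatchCode match(SimpleCell cell); }

    // Walks the sorted cells and returns the cells that were handed to the matcher.
    static List<String> scan(List<SimpleCell> sortedCells, Matcher matcher) {
        List<String> seenByMatcher = new ArrayList<>();
        int i = 0;
        while (i < sortedCells.size()) {
            SimpleCell cell = sortedCells.get(i);
            seenByMatcher.add(cell.toString());
            MatchCode code = matcher.match(cell);
            i++; // the equivalent of one heap.next()
            if (code == MatchCode.SEEK_NEXT_COL) {
                // Keep skipping until the column changes; do not call match in between.
                while (i < sortedCells.size() && sortedCells.get(i).sameColumn(cell)) i++;
            } else if (code == MatchCode.SEEK_NEXT_ROW) {
                // Keep skipping until the row changes; do not call match in between.
                while (i < sortedCells.size() && sortedCells.get(i).sameRow(cell)) i++;
            }
        }
        return seenByMatcher;
    }

    public static void main(String[] args) {
        List<SimpleCell> cells = Arrays.asList(
            new SimpleCell("r1", "a"), new SimpleCell("r1", "a"), // two versions of r1/a
            new SimpleCell("r1", "b"), new SimpleCell("r2", "a"));
        System.out.println(scan(cells, c -> MatchCode.SEEK_NEXT_COL)); // [r1/a, r1/b, r2/a]
    }
}
```

With a matcher that always answers SEEK_NEXT_COL, the second version of r1/a is skipped without being matched. The real code would also have to decide when skipping is cheaper than seeking, which this sketch leaves out.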

Thanks.

> Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-17958
>                 URL: https://issues.apache.org/jira/browse/HBASE-17958
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Guanghao Zhang
>
> {code}
> ScanQueryMatcher.MatchCode qcode = matcher.match(cell);
> qcode = optimize(qcode, cell);
> {code}
> The optimize method may change the MatchCode from SEEK_NEXT_COL/SEEK_NEXT_ROW to SKIP,
> but the next cell is still passed to ScanQueryMatcher. This gives wrong results with some
> filters, e.g. ColumnCountGetFilter, which just counts the number of columns. If a cell from the
> same column is passed to this filter again, the count will be wrong. So we should avoid passing
> the cell to ScanQueryMatcher when optimizing SEEK to SKIP.
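
To make the miscount concrete, here is a small illustration with a hypothetical counting filter. It is not the real ColumnCountGetFilter code; CountingFilter and included() are made-up names that only model a filter which assumes every cell it receives belongs to a new column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnCountSketch {
    // Hypothetical filter: counts one column per call and includes cells up to a limit.
    static final class CountingFilter {
        private final int limit;
        private int count = 0;
        CountingFilter(int limit) { this.limit = limit; }
        boolean accept(String qualifier) {
            count++; // assumes the caller never passes the same column twice
            return count <= limit;
        }
    }

    // Returns the qualifiers the filter lets through, in the order they were passed in.
    static List<String> included(List<String> qualifiers, int limit) {
        CountingFilter filter = new CountingFilter(limit);
        List<String> result = new ArrayList<>();
        for (String q : qualifiers) {
            if (filter.accept(q)) result.add(q);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same column passed twice (SEEK turned into SKIP): column b is lost.
        System.out.println(included(Arrays.asList("a", "a", "b"), 2)); // [a, a]
        // Same-column cells skipped before reaching the filter: both columns are kept.
        System.out.println(included(Arrays.asList("a", "b"), 2));      // [a, b]
    }
}
```

With a limit of two columns, the duplicate cell for column a uses up the second slot and column b is dropped, which is the kind of wrong result described above.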



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
