X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 099A610F64
    for ; Tue, 30 Jul 2013 03:30:07 +0000 (UTC)
Received: (qmail 5126 invoked by uid 500); 30 Jul 2013 03:29:57 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 4934 invoked by uid 500); 30 Jul 2013 03:29:54 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 4908 invoked by uid 99); 30 Jul 2013 03:29:51 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 30 Jul 2013 03:29:51 +0000
Date: Tue, 30 Jul 2013 03:29:51 +0000 (UTC)
From: "Ted Yu (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723343#comment-13723343 ]

Ted Yu commented on HBASE-9079:
-------------------------------

w.r.t. ordering, there is no method in FilterBase which can tell us whether the hint provider operates at the row or the column level.
For 0.96 / trunk, we may add such a method so that FilterList can (re)order the Filters accordingly.
For 0.94, we can provide documentation on this aspect so that users can register Filters in the correct order.

bq. you mean against the 0.95/0.96 tree ?

Yes. I meant a patch against trunk.
Thanks

> FilterList getNextKeyHint skips rows that should be included in the results
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-9079
>                 URL: https://issues.apache.org/jira/browse/HBASE-9079
>             Project: HBase
>          Issue Type: Bug
>          Components: Filters
>    Affects Versions: 0.94.10
>            Reporter: Viral Bajaria
>         Attachments: TestFail.patch, TestSuccess.patch
>
>
> I hit a weird issue/bug and am able to reproduce the error consistently. The problem arises when a FilterList has two filters that each implement the getNextKeyHint method.
> The current implementation works as follows: StoreScanner calls matcher.getNextKeyHint() whenever it gets a SEEK_NEXT_USING_HINT. This in turn calls filter.getNextKeyHint(), where the filter at this stage is of type FilterList. The implementation in FilterList iterates through all the filters and keeps the max KeyValue that it sees. All is fine if only one of the filters wrapped in the FilterList implements getNextKeyHint, but if several of them implement it, that is where things get weird.
> For example:
> - create two filters: one is FuzzyRowFilter and the second is ColumnRangeFilter. Both of them implement getNextKeyHint
> - wrap them in a FilterList with MUST_PASS_ALL
> - FuzzyRowFilter will seek to the correct first row and then pass it to ColumnRangeFilter, which will return the SEEK_NEXT_USING_HINT code.
> - Now in FilterList, when getNextKeyHint is called, it calls the one on FuzzyRowFilter first, which basically says what the next row should be, while in reality we want the ColumnRangeFilter to give the seek hint.
> - The above behavior skips data that should be returned, which I have verified by using a RowFilter with RegexStringComparator.
> I updated the FilterList to maintain state on which filter returns the SEEK_NEXT_USING_HINT and, in getNextKeyHint, I invoke the method on the saved filter and reset that state.
> I tested it with my current queries and it works fine, but I need to run the entire test suite to make sure I have not introduced any regression. In addition, I need to figure out what the behavior should be when the operation is MUST_PASS_ONE, but I doubt it should be any different.
> Is my understanding of it being a bug correct ? Or am I trivializing it and ignoring something very important ? If it's tough to wrap your head around the explanation, then I can open a JIRA and upload a patch against 0.94 head.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
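
The hint selection described in the issue can be modelled without HBase. The following is a minimal sketch, not the FilterList code or the attached patch: Key, HintingFilter, maxHint, hintFromSeekingFilter and demoFilters are hypothetical names, keys are reduced to a row/column pair, and the two demo filters only imitate a row-level hint provider (such as FuzzyRowFilter) and a column-level one (such as ColumnRangeFilter). It contrasts keeping the max hint with asking only the filter that returned SEEK_NEXT_USING_HINT.

```java
import java.util.Arrays;
import java.util.List;

class FilterListHintSketch {
    enum ReturnCode { INCLUDE, SEEK_NEXT_USING_HINT }

    /** Simplified cell coordinate (assumption): ordered by row, then by column. */
    static final class Key implements Comparable<Key> {
        final String row;
        final String column;
        Key(String row, String column) { this.row = row; this.column = column; }
        @Override public int compareTo(Key other) {
            int c = row.compareTo(other.row);
            return c != 0 ? c : column.compareTo(other.column);
        }
        @Override public String toString() { return row + "/" + column; }
    }

    /** Hypothetical stand-in for a filter that can supply a seek hint. */
    interface HintingFilter {
        ReturnCode filterKey(Key current);
        Key nextKeyHint(Key current);
    }

    /** Behaviour described in the report: keep the largest hint offered by any filter. */
    static Key maxHint(List<HintingFilter> filters, Key current) {
        Key max = null;
        for (HintingFilter f : filters) {
            Key hint = f.nextKeyHint(current);
            if (hint != null && (max == null || hint.compareTo(max) > 0)) {
                max = hint;
            }
        }
        return max;
    }

    /** Reporter's idea, simplified: ask only the filter that returned SEEK_NEXT_USING_HINT. */
    static Key hintFromSeekingFilter(List<HintingFilter> filters, Key current) {
        for (HintingFilter f : filters) {
            if (f.filterKey(current) == ReturnCode.SEEK_NEXT_USING_HINT) {
                return f.nextKeyHint(current);
            }
        }
        return null; // no filter asked for a seek
    }

    /** Two made-up filters: a row-level hint provider and a column-level one. */
    static List<HintingFilter> demoFilters() {
        HintingFilter rowLevel = new HintingFilter() {
            public ReturnCode filterKey(Key current) { return ReturnCode.INCLUDE; }
            // Assumed: the next matching row after the current one is "r5".
            public Key nextKeyHint(Key current) { return new Key("r5", ""); }
        };
        HintingFilter columnLevel = new HintingFilter() {
            // Assumed: the wanted column range starts at column "c".
            public ReturnCode filterKey(Key current) {
                return current.column.compareTo("c") < 0
                        ? ReturnCode.SEEK_NEXT_USING_HINT : ReturnCode.INCLUDE;
            }
            public Key nextKeyHint(Key current) { return new Key(current.row, "c"); }
        };
        return Arrays.asList(rowLevel, columnLevel);
    }

    public static void main(String[] args) {
        Key current = new Key("r1", "a");
        List<HintingFilter> filters = demoFilters();
        System.out.println(maxHint(filters, current));               // prints r5/
        System.out.println(hintFromSeekingFilter(filters, current)); // prints r1/c
    }
}
```

In this model the max hint jumps to row r5 and skips the wanted columns of r1, while the hint taken from the filter that requested the seek stays inside r1. The real patch keeps the saved filter as state between calls; the sketch recomputes it to stay short.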