Return-Path:
X-Original-To: apmail-drill-issues-archive@minotaur.apache.org
Delivered-To: apmail-drill-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id EC17318F4B for ; Mon, 10 Aug 2015 21:32:45 +0000 (UTC)
Received: (qmail 97105 invoked by uid 500); 10 Aug 2015 21:32:45 -0000
Delivered-To: apmail-drill-issues-archive@drill.apache.org
Received: (qmail 97073 invoked by uid 500); 10 Aug 2015 21:32:45 -0000
Mailing-List: contact issues-help@drill.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@drill.apache.org
Delivered-To: mailing list issues@drill.apache.org
Received: (qmail 97020 invoked by uid 99); 10 Aug 2015 21:32:45 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 10 Aug 2015 21:32:45 +0000
Date: Mon, 10 Aug 2015 21:32:45 +0000 (UTC)
From: "Jinfeng Ni (JIRA)"
To: issues@drill.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (DRILL-3621) Wrong results when Drill on Hbase query contains rowkey "or" or "IN"
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/DRILL-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680800#comment-14680800 ]

Jinfeng Ni commented on DRILL-3621:
-----------------------------------

I think the incorrect query result is caused by the fact that HBaseScanSpec's stopRow is exclusive, not inclusive. The stopRow should be some value which is larger than "DUMMY10".

public Scan(byte[] startRow, byte[] stopRow)
Create a Scan operation for the range of rows specified.
Parameters:
startRow - row to start scanner at or after (inclusive)
stopRow - row to stop scanner before (exclusive)


> Wrong results when Drill on Hbase query contains rowkey "or" or "IN"
> --------------------------------------------------------------------
>
>                 Key: DRILL-3621
>                 URL: https://issues.apache.org/jira/browse/DRILL-3621
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.1.0
>            Reporter: Hao Zhu
>            Assignee: Chris Westin
>            Priority: Critical
>
> If Drill on Hbase query contains row_key "in" or "or", it produces wrong results.
> For example:
> 1. Create a hbase table
> {code}
> create 'testrowkey','cf'
> put 'testrowkey','DUMMY1','cf:c','value1'
> put 'testrowkey','DUMMY2','cf:c','value2'
> put 'testrowkey','DUMMY3','cf:c','value3'
> put 'testrowkey','DUMMY4','cf:c','value4'
> put 'testrowkey','DUMMY5','cf:c','value5'
> put 'testrowkey','DUMMY6','cf:c','value6'
> put 'testrowkey','DUMMY7','cf:c','value7'
> put 'testrowkey','DUMMY8','cf:c','value8'
> put 'testrowkey','DUMMY9','cf:c','value9'
> put 'testrowkey','DUMMY10','cf:c','value10'
> {code}
> 2. Drill queries:
> {code}
> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> SELECT CONVERT_FROM(ROW_KEY,'UTF8') RK FROM hbase.testrowkey T WHERE ROW_KEY = 'DUMMY10';
> +----------+
> |    RK    |
> +----------+
> | DUMMY10  |
> +----------+
> 1 row selected (1.186 seconds)
> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> SELECT CONVERT_FROM(ROW_KEY,'UTF8') RK FROM hbase.testrowkey T WHERE ROW_KEY = 'DUMMY1';
> +---------+
> |   RK    |
> +---------+
> | DUMMY1  |
> +---------+
> 1 row selected (0.691 seconds)
> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> SELECT CONVERT_FROM(ROW_KEY,'UTF8') RK FROM hbase.testrowkey T WHERE ROW_KEY IN ('DUMMY1' , 'DUMMY10');
> +---------+
> |   RK    |
> +---------+
> | DUMMY1  |
> +---------+
> 1 row selected (0.71 seconds)
> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> SELECT CONVERT_FROM(ROW_KEY,'UTF8') RK FROM hbase.testrowkey T WHERE ROW_KEY ='DUMMY1' OR ROW_KEY = 'DUMMY10';
> +---------+
> |   RK    |
> +---------+
> | DUMMY1  |
> +---------+
> 1 row selected (0.693 seconds)
> {code}
> From explain plan, filter is pushed down to hbase scan layer.
> {code}
> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> explain plan for SELECT CONVERT_FROM(ROW_KEY,'UTF8') RK FROM hbase.testrowkey T WHERE ROW_KEY IN ('DUMMY1' , 'DUMMY10');
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(RK=[CONVERT_FROMUTF8($0)])
> 00-02        Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=testrowkey, startRow=DUMMY1, stopRow=DUMMY10, filter=null], columns=[`row_key`]]])
> |
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
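
The exclusive stopRow described in the comment can be reproduced without an HBase cluster. What follows is only a minimal sketch, not Drill's or HBase's code: the class StopRowSketch and its methods compareRows, scan, and nextKey are hypothetical names introduced for illustration. It assumes row keys are ordered by unsigned lexicographic byte comparison, and it uses "append one zero byte" as one possible way to obtain a stop value larger than "DUMMY10"; the actual fix chosen for DRILL-3621 may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StopRowSketch {
    // Unsigned lexicographic comparison of two row keys (assumed ordering).
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        // A key that is a strict prefix of another sorts before it.
        return a.length - b.length;
    }

    // Simulates a half-open range scan: startRow inclusive, stopRow exclusive.
    static List<String> scan(List<String> rows, byte[] startRow, byte[] stopRow) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            byte[] key = row.getBytes(StandardCharsets.UTF_8);
            if (compareRows(key, startRow) >= 0 && compareRows(key, stopRow) < 0) {
                out.add(row);
            }
        }
        return out;
    }

    // Smallest key strictly greater than 'row': the same bytes plus a trailing 0x00.
    static byte[] nextKey(byte[] row) {
        return Arrays.copyOf(row, row.length + 1); // copyOf pads with zero
    }

    public static void main(String[] args) {
        // Same row keys as the 'testrowkey' table in the report.
        List<String> rows = Arrays.asList(
            "DUMMY1", "DUMMY2", "DUMMY3", "DUMMY4", "DUMMY5",
            "DUMMY6", "DUMMY7", "DUMMY8", "DUMMY9", "DUMMY10");
        byte[] start = "DUMMY1".getBytes(StandardCharsets.UTF_8);
        byte[] stop = "DUMMY10".getBytes(StandardCharsets.UTF_8);

        // stopRow=DUMMY10 is excluded, matching the wrong result in the report.
        System.out.println(scan(rows, start, stop));          // prints [DUMMY1]
        // A stop value just past DUMMY10 brings the row back into range.
        System.out.println(scan(rows, start, nextKey(stop))); // prints [DUMMY1, DUMMY10]
    }
}
```

With the plan's startRow=DUMMY1 and stopRow=DUMMY10, the simulated scan returns only DUMMY1, the same single row the IN and OR queries returned. Note that DUMMY2 through DUMMY9 sort after DUMMY10 in byte order, which is why they stay outside the range in both cases.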