Date: Mon, 15 May 2017 20:15:04 +0000 (UTC)
From: "Jinfeng Ni (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Assigned] (DRILL-5480) Empty batch returning from HBase may cause SchemChangeException or incorrect query result

     [ https://issues.apache.org/jira/browse/DRILL-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinfeng Ni reassigned DRILL-5480:
---------------------------------

    Assignee: Jinfeng Ni

> Empty batch returning from HBase may cause SchemChangeException or incorrect query result
> -----------------------------------------------------------------------------------------
>
>                 Key: DRILL-5480
>                 URL: https://issues.apache.org/jira/browse/DRILL-5480
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>
> The following repro was provided by [~haozhu].
> 1. Create an HBase table with 4 regions.
> {code}
> create 'myhbase', 'cf1','cf2', {SPLITS => ['a', 'b', 'c']}
> put 'myhbase','a','cf1:col1','somedata'
> put 'myhbase','b','cf1:col2','somedata'
> put 'myhbase','c','cf2:col1','somedata'
> {code}
> One region has cf1.col1. One region has column family 'cf1' but no 'col1' under it. One region has only column family 'cf2'. The last region is completely empty.
> 2. Prepare a csv file.
> {code}
> select * from dfs.tmp.`joinhbase.csv`;
> +-------------------+
> |      columns      |
> +-------------------+
> | ["1","somedata"]  |
> | ["2","somedata"]  |
> | ["3","somedata"]  |
> {code}
> Now run the following query on Drill 1.11.0-SNAPSHOT:
> {code}
> select cast(H.row_key as varchar(10)) as keyCol, CONVERT_FROM(H.cf1.col1, 'UTF8') as col1
> from
> hbase.myhbase H JOIN dfs.tmp.`joinhbase.csv` C
> ON CONVERT_FROM(H.cf1.col1, 'UTF8')= C.columns[1]
> ;
> {code}
> The correct query result should be:
> {code}
> +---------+-----------+
> | keyCol  |   col1    |
> +---------+-----------+
> | a       | somedata  |
> | a       | somedata  |
> | a       | somedata  |
> +---------+-----------+
> {code}
> With broadcast join turned off, the same query randomly either fails with SchemaChangeException or returns an incorrect result. 'Randomly' means that, within the same session, the same query can hit SchemaChangeException in one run and return an incorrect result in the next.
> {code}
> alter session set `planner.enable_broadcast_join`=false;
> {code}
> {code}
> select cast(H.row_key as varchar(10)) as keyCol, CONVERT_FROM(H.cf1.col1, 'UTF8') as col1
> . . . . . . . . . . . . . . . . . .> from
> . . . . . . . . . . . . . . . . . .> hbase.myhbase H JOIN dfs.tmp.`joinhbase.csv` C
> . . . . . . . . . . . . . . . . . .> ON CONVERT_FROM(H.cf1.col1, 'UTF8')= C.columns[1]
> . . . . . . . . . . . . . . . . . .> ;
> Error: SYSTEM ERROR: SchemaChangeException: Hash join does not support schema changes
> {code}
> {code}
> +---------+-------+
> | keyCol  | col1  |
> +---------+-------+
> +---------+-------+
> No rows selected (0.302 seconds)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
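
Note on the region layout in step 1: the description of which region holds which column can be checked with a small model. The sketch below is plain Python, not Drill or HBase code. It assumes the usual HBase convention that each split point is the inclusive start key of the next region; the names `region_index` and `columns_per_region` are made up for this illustration.

```python
from bisect import bisect_right

# Split points and puts copied from the repro in step 1.
SPLITS = ["a", "b", "c"]
PUTS = [
    ("a", "cf1:col1", "somedata"),
    ("b", "cf1:col2", "somedata"),
    ("c", "cf2:col1", "somedata"),
]


def region_index(row_key, splits=SPLITS):
    """Return the index of the region that owns row_key.

    Assumption: regions are [start, end) ranges, so a row key equal to a
    split point belongs to the region that starts at that split point.
    """
    return bisect_right(splits, row_key)


def columns_per_region(puts=PUTS, splits=SPLITS):
    """Collect the set of 'family:qualifier' columns stored in each region."""
    regions = [set() for _ in range(len(splits) + 1)]
    for row_key, column, _value in puts:
        regions[region_index(row_key, splits)].add(column)
    return regions


regions = columns_per_region()
for i, cols in enumerate(regions):
    status = "has cf1:col1" if "cf1:col1" in cols else "no cf1:col1"
    print(i, sorted(cols), status)
```

In this model only one of the four regions contains cf1:col1 and one region holds no data at all, which matches the layout described in the report: a scan of any other region has no cf1.col1 values to return.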