Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id BD96C200B35 for ; Tue, 21 Jun 2016 03:52:45 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id BBF8B160A65; Tue, 21 Jun 2016 01:52:45 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id B71A5160A55 for ; Tue, 21 Jun 2016 03:52:44 +0200 (CEST) Received: (qmail 36166 invoked by uid 500); 21 Jun 2016 01:52:43 -0000 Mailing-List: contact dev-help@drill.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@drill.apache.org Delivered-To: mailing list dev@drill.apache.org Delivered-To: moderator for dev@drill.apache.org Received: (qmail 32116 invoked by uid 99); 21 Jun 2016 01:48:24 -0000 X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.179 X-Spam-Level: * X-Spam-Status: No, score=1.179 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=l62v+2B1hQxqtP2Gfh8vdKg/EO5MBAGcBq4v/KFZ1c0=; b=bFOz118XCUhsIk8Y/SfOqr6GDwzRPy70wFJKq7Ql3W8wSJGblOunAqUMLBMyHHrZZH SNomoqFN6Y1VNG9xtwcuz+8rJrKXB23XZ6iMODYH/dZY1jiqXTXexy0eRDRaHA54saIV KaSSBTwY1PWxEiMtmH8Q9k+cVMc7ZJdE0tnBX51uFF//GSumi8oE1vUAhhD3MDeJ1Mgt g4c+m7mR2rADm1j3tmTVBIIxFXZwPAM5n0ViIheqKW31dAlrlRwp8JVwwPyg/NtGnytu iH7WmgO8iM56iunPhkjey4+hBA4HZomWrWEt/v0HDYR54Xjkgea8/kfhDJaSzzqbtcAr JQyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=l62v+2B1hQxqtP2Gfh8vdKg/EO5MBAGcBq4v/KFZ1c0=; b=MX9YEuz3lJW5EigmIXbuOg14HgG8XfT14YcH03mAr/UU9ilz5Ay1TAUNxxo3/cHQ2Y eP0ZE3T9b2Lk6AltiP6K/rSAz9ZY5YJNt0pRg8d1qljEBOZFL/IvzemA9OBuJCDHLT+e vTOr6BM2mthZLXNqJIpcejytgV1xr5M/uuW7BkwGVXjMrmEe2hUhhVlThRJdAeHOIlcq M+3SgIQx8/sweHCH9BdHCQtHdhCuyl8kIyvGgo4X98CaV/87DZQ1wwrjMA5rFZd3yM64 tGtTnmkzUnuVL8son1NPb17m32uj2KFim4dPryK3XUIVHHQ7o/Yv+GeLxjOIDHdQlgXf jrhA== X-Gm-Message-State: ALyK8tKa1gs0GrsWOdbLQ4FWqzcfLSNRZVnpV3IdZbmaATfJYpHrzijRI18thJpU0xBUcmzFp9dD2nkNLcCAgQ== X-Received: by 10.176.4.68 with SMTP id 62mr8197570uav.94.1466473698535; Mon, 20 Jun 2016 18:48:18 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: References: From: qiang li Date: Tue, 21 Jun 2016 09:48:17 +0800 Message-ID: Subject: Re: DRILL-4199: Add Support for HBase 1.X - planning to merge To: user Cc: Aditya Kishore , dev Content-Type: multipart/alternative; boundary=94eb2c000e18dfac080535c003d6 archived-at: Tue, 21 Jun 2016 01:52:45 -0000 --94eb2c000e18dfac080535c003d6 Content-Type: text/plain; charset=UTF-8 Hi Aman, I did not fully test with the old version. Cloud you please help me create the JIRA issue, I think my account have not the privilege, my account is griffinli and can not find the place to create new issue. Below is the explain detail for the same SQL in different nodes of cluster. This is the correct plan which only have two nodes: 0: jdbc:drill:zk=xxx:> explain plan for select CONVERT_FROM(BYTE_SUBSTR(`ref0`.row_key,-8,8),'BIGINT_BE') as uid, convert_from(`ref0`.`v`.`v`,'UTF8') as v from hbase.`offers_nation_idx` as `nation` join hbase.offers_ref0 as `ref0` on BYTE_SUBSTR(`ref0`.row_key,-8,8) = nation.`v`.`v` where `nation`.row_key > '0br' and `nation`.row_key < '0bs' limit 10; +------+------+ | text | json | +------+------+ | 00-00 Screen 00-01 Project(uid=[$0], v=[$1]) 00-02 SelectionVectorRemover 00-03 Limit(fetch=[10]) 00-04 UnionExchange 01-01 SelectionVectorRemover 01-02 Limit(fetch=[10]) 01-03 Project(uid=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($3, -8, 8))], v=[CONVERT_FROMUTF8(ITEM($4, 'v'))]) 01-04 Project(row_key=[$3], v=[$4], ITEM=[$5], row_key0=[$0], v0=[$1], $f2=[$2]) 01-05 HashJoin(condition=[=($2, $5)], joinType=[inner]) 01-07 Project(row_key=[$0], v=[$1], $f2=[BYTE_SUBSTR($0, -8, 8)]) 01-09 Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=offers_ref0, startRow=null, stopRow=null, filter=null], columns=[`*`]]]) 01-06 Project(row_key0=[$0], v0=[$1], ITEM=[$2]) 01-08 *BroadcastExchange* 02-01 Project(row_key=[$0], v=[$1], ITEM=[ITEM($1, 'v')]) 02-02 Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=offers_nation_idx, startRow=0br\x00, stopRow=0bs, filter=FilterList AND (2/2): [RowFilter (GREATER, 0br), RowFilter (LESS, 0bs)]], columns=[`row_key`, `v`, `v`.`v`]]]) This is the plan that fails which have more than 5 nodes: 0: jdbc:drill:zk=xxx:> explain plan for select CONVERT_FROM(BYTE_SUBSTR(`ref0`.row_key,-8,8),'BIGINT_BE') as uid, convert_from(`ref0`.`v`.`v`,'UTF8') as v from hbase.`offers_nation_idx` as `nation` join hbase.offers_ref0 as `ref0` on BYTE_SUBSTR(`ref0`.row_key,-8,8) = nation.`v`.`v` where `nation`.row_key > '0br' and `nation`.row_key < '0bs' limit 10; +------+------+ | text | json | +------+------+ | 00-00 Screen 00-01 Project(uid=[$0], v=[$1]) 00-02 SelectionVectorRemover 00-03 Limit(fetch=[10]) 00-04 UnionExchange 01-01 SelectionVectorRemover 01-02 Limit(fetch=[10]) 01-03 Project(uid=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($3, -8, 8))], v=[CONVERT_FROMUTF8(ITEM($4, 'v'))]) 01-04 Project(row_key=[$3], v=[$4], ITEM=[$5], row_key0=[$0], v0=[$1], $f2=[$2]) 01-05 HashJoin(condition=[=($2, $5)], joinType=[inner]) 01-07 Project(row_key=[$0], v=[$1], $f2=[$2]) 01-09 *HashToRandomExchange*(dist0=[[$2]]) 02-01 UnorderedMuxExchange 04-01 Project(row_key=[$0], v=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2)]) 04-02 Project(row_key=[$0], v=[$1], $f2=[BYTE_SUBSTR($0, -8, 8)]) 04-03 Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=offers_ref0, startRow=null, stopRow=null, filter=null], columns=[`*`]]]) 01-06 Project(row_key0=[$0], v0=[$1], ITEM=[$2]) 01-08 Project(row_key=[$0], v=[$1], ITEM=[$2]) 01-10 *HashToRandomExchange*(dist0=[[$2]]) 03-01 UnorderedMuxExchange 05-01 Project(row_key=[$0], v=[$1], ITEM=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2)]) 05-02 Project(row_key=[$0], v=[$1], ITEM=[ITEM($1, 'v')]) 05-03 Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=offers_nation_idx, startRow=0br\x00, stopRow=0bs, filter=FilterList AND (2/2): [RowFilter (GREATER, 0br), RowFilter (LESS, 0bs)]], columns=[`row_key`, `v`, `v`.`v`]]]) The difference is use *BroadcastExchange *and *HashToRandomExchange.* You can create the JIRA and send me the link . Thanks. 2016-06-20 23:44 GMT+08:00 Aman Sinha : > Hi Qiang, > were you seeing this same issue with the prior HBase version also ? (I > would think this is not a regression). It would be best to create a new > JIRA and attach the EXPLAIN plans for the successful and failed runs. With > more nodes some minor fragments of the hash join may be getting empty input > batches and I am guessing that has something to do with the > SchemaChangeException. Someone would need to debug once you create the > JIRA with relevant details. > > -Aman > > On Mon, Jun 20, 2016 at 5:13 AM, qiang li wrote: > > > Thanks Aditya. > > > > By the way, I found another issue. > > > > Let say I have two tables. > > > > offers_ref0 : rowkey salt(1byte)+long uid(8 byte ) , family: v, > qualifier: > > v(string) > > offers_nation_idx: rowkey salt(1byte) + string, family:v, qualifier: > v(long > > 8 byte) > > > > there is the SQL: > > > > select CONVERT_FROM(BYTE_SUBSTR(`ref0`.row_key,-8,8),'BIGINT_BE') as uid, > > convert_from(`ref0`.`v`.`v`,'UTF8') as v from hbase.`offers_nation_idx` > as > > `nation` join hbase.offers_ref0 as `ref0` on > > CONVERT_FROM(BYTE_SUBSTR(`ref0`.row_key,-8,8),'BIGINT_BE') = > > CONVERT_FROM(nation.`v`.`v`,'BIGINT_BE') where `nation`.row_key > '0br' > > and `nation`.row_key < '0bs' limit 10 > > > > When I execute the query with single node or less than 5 nodes, its > working > > good. But when I execute it in cluster which have about 14 nodes, its > throw > > a exception: > > > > First time will throw this exception: > > *Caused by: java.sql.SQLException: SYSTEM ERROR: SchemaChangeException: > > Hash join does not support schema changes* > > > > Then if I query again, it will always throw below exception: > > *Query Failed: An Error Occurred* > > *org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > > IllegalStateException: Failure while reading vector. Expected vector > class > > of org.apache.drill.exec.vector.NullableIntVector but was holding vector > > class org.apache.drill.exec.vector.complex.MapVector, field= > > v(MAP:REQUIRED)[v(VARBINARY:OPTIONAL)[$bits$(UINT1:REQUIRED), > > v(VARBINARY:OPTIONAL)[$offsets$(UINT4:REQUIRED)]]] Fragment 12:4 [Error > Id: > > 06c6eae4-0822-4714-b0bf-a6e04ebfec79 on xxx:31010]* > > > > Its very strange, and I do not know how to solve it. > > I tried add node to the cluster one by one, it will reproduce when I > added > > 5 nodes. Can anyone help me solve this issue? > > > > > > > > > > 2016-06-17 4:39 GMT+08:00 Aditya : > > > > > https://issues.apache.org/jira/browse/DRILL-4727 > > > > > > On Thu, Jun 16, 2016 at 11:39 AM, Aman Sinha > > wrote: > > > > > >> Qiang/Aditya can you create a JIRA for this and mark it for 1.7. > > thanks. > > >> > > >> On Thu, Jun 16, 2016 at 11:25 AM, Aditya > > wrote: > > >> > > >> > Thanks for reporting, I'm looking into it and will post a patch > soon. > > >> > > > >> > On Wed, Jun 15, 2016 at 7:27 PM, qiang li > > wrote: > > >> > > > >> > > Hi Aditya, > > >> > > > > >> > > I tested the latest version and got this exception and the > drillbit > > >> fail > > >> > > to startup . > > >> > > > > >> > > Exception in thread "main" java.lang.NoSuchMethodError: > > >> > > io.netty.util.UniqueName.(Ljava/lang/String;)V > > >> > > at > > >> io.netty.channel.ChannelOption.(ChannelOption.java:136) > > >> > > at > > >> io.netty.channel.ChannelOption.valueOf(ChannelOption.java:99) > > >> > > at > > >> io.netty.channel.ChannelOption.(ChannelOption.java:42) > > >> > > at > > >> > > org.apache.drill.exec.rpc.BasicServer.(BasicServer.java:63) > > >> > > at > > >> > > > org.apache.drill.exec.rpc.user.UserServer.(UserServer.java:74) > > >> > > at > > >> > > > > >> > > org.apache.drill.exec.service.ServiceEngine.(ServiceEngine.java:78) > > >> > > at > > >> > org.apache.drill.exec.server.Drillbit.(Drillbit.java:108) > > >> > > at > > >> org.apache.drill.exec.server.Drillbit.start(Drillbit.java:285) > > >> > > at > > >> org.apache.drill.exec.server.Drillbit.start(Drillbit.java:271) > > >> > > at > > >> org.apache.drill.exec.server.Drillbit.main(Drillbit.java:267) > > >> > > > > >> > > It will working if I remove > > jars/3rdparty/netty-all-4.0.23.Final.jar, > > >> the > > >> > > drill can startup. I think there have some package dependency > > version > > >> > > issue, do you think so ? > > >> > > > > >> > > > > >> > > > > >> > > 2016-06-15 8:14 GMT+08:00 Aditya : > > >> > > > > >> > >> HBase 1.x support has been merged and is available in latest > > >> > >> 1.7.0-SNAPSHOT > > >> > >> builds. > > >> > >> > > >> > >> On Wed, Jun 1, 2016 at 1:23 PM, Aditya > > >> wrote: > > >> > >> > > >> > >> > Thanks Jacques for promptly reviewing my long series of > patches! > > >> > >> > > > >> > >> > I'm planning to merge the HBase 1.x support some time in next > 48 > > >> > hours. > > >> > >> > > > >> > >> > If anyone else is interested and willing, please review the > > latest > > >> > patch > > >> > >> > here[1]. > > >> > >> > > > >> > >> > aditya... > > >> > >> > > > >> > >> > [1] https://github.com/apache/drill/pull/443/files > > >> > >> > > > >> > >> > > >> > > > > >> > > > > >> > > > >> > > > > > > > > > --94eb2c000e18dfac080535c003d6--