Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 3B467200B62 for ; Fri, 12 Aug 2016 14:27:28 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 39BE9160AB0; Fri, 12 Aug 2016 12:27:28 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 581B2160AA6 for ; Fri, 12 Aug 2016 14:27:27 +0200 (CEST) Received: (qmail 78005 invoked by uid 500); 12 Aug 2016 12:27:26 -0000 Mailing-List: contact dev-help@phoenix.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@phoenix.apache.org Delivered-To: mailing list dev@phoenix.apache.org Received: (qmail 77992 invoked by uid 99); 12 Aug 2016 12:27:26 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Aug 2016 12:27:26 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 8BD371A5390 for ; Fri, 12 Aug 2016 12:27:25 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.181 X-Spam-Level: * X-Spam-Status: No, score=1.181 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, MIME_QP_LONG_LINE=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=splicemachine.com Received: from mx2-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id U-OwWF4ZfLQJ for ; Fri, 12 Aug 2016 12:27:23 +0000 (UTC) Received: from mail-pa0-f50.google.com (mail-pa0-f50.google.com [209.85.220.50]) by mx2-lw-eu.apache.org (ASF Mail Server at mx2-lw-eu.apache.org) with ESMTPS id 214385F3F3 for ; Fri, 12 Aug 2016 12:27:22 +0000 (UTC) Received: by mail-pa0-f50.google.com with SMTP id fi15so8462878pac.1 for ; Fri, 12 Aug 2016 05:27:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=splicemachine.com; s=google; h=content-transfer-encoding:from:mime-version:subject:message-id:date :references:to; bh=dOHbotaF1kcpGHudH4TW8ERPqahtovivQcBf8ITFJ0w=; b=Iwpx6SrbI5+yWbuZGGDh9rMi1Y9BZ/OzXOYJJsuVu3Uz5EENJ4a0NwEX5vEymu6+7j diFTfjRf80wSoQ9oY1SUL6psBg7cVF2ifh6yCc9B0dTiM7nEPCs95mXDv/YNkllHd/GX nvSwlBx0h8Dh9ijgl+eCs1VjrdNt9nrrfnDi8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:content-transfer-encoding:from:mime-version :subject:message-id:date:references:to; bh=dOHbotaF1kcpGHudH4TW8ERPqahtovivQcBf8ITFJ0w=; b=JRPAs7r/X9n8EBsrSA5Qbmwdb6vxWByl6dOdleNGAZl0qjKzkCoA1KESPRbeZrxE9d PPH0IhE98r8J+/1cjIwG+B2heDVBVm21cY8QkDrx7u5AbbYwmXIDr4qZ/FsaywTD7Hbc 9A45/h1AvPnTGgqC3ZdSg9ekESmIu/2/U9tuALGvTQb2I+PdzUb0oQgWZQ58e5To7kPO xqrOULHT2BL+U2asnyWJuoT+OOUqrJO1PYZtaLKRrInb25KRMDGAV6ImQ4gHBPUvTf0x Pldnqfve9snzrA9U2ea7UUK+wFsjDPolYziJEQ9bweClNunlzeN/adpYmc2azONf/DCh cr2A== X-Gm-Message-State: AEkoousYsZaJ4EsN68RwAzUMi5z7j7x+hqwSvBwyJQYeE+DJA1ENB3xSRxUG+wc0x1cyJA== X-Received: by 10.66.132.38 with SMTP id or6mr26542583pab.84.1471004840571; Fri, 12 Aug 2016 05:27:20 -0700 (PDT) Received: from [10.0.0.13] (ip68-0-181-77.tc.ph.cox.net. [68.0.181.77]) by smtp.gmail.com with ESMTPSA id o8sm12877199pav.5.2016.08.12.05.27.19 for (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Fri, 12 Aug 2016 05:27:19 -0700 (PDT) Content-Transfer-Encoding: 7bit Content-Type: multipart/alternative; boundary=Apple-Mail-1013EE48-2791-4F2A-84E3-6B86322947E6 From: Amit Mudgal Mime-Version: 1.0 (1.0) Subject: Issues while Running Apache Phoenix against TPC-H data Message-Id: <1B655E92-6784-4EAA-8C2D-31EF1B23D406@splicemachine.com> Date: Fri, 12 Aug 2016 05:27:18 -0700 References: <6B9F601F-3762-4154-97AD-767176793CB6@splicemachine.com> To: dev@phoenix.apache.org X-Mailer: iPhone Mail (13G35) archived-at: Fri, 12 Aug 2016 12:27:28 -0000 --Apple-Mail-1013EE48-2791-4F2A-84E3-6B86322947E6 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable >=20 > Hi Dev team, >=20 > I was evaluating Apache Phoenix against the TPC-H data based on the presen= tation given at Hadoop summit in june stating that most TPC-H queries should= run. > Here is the setup details i have in my local environment : >=20 > 1. One master node and 3 region servers with 3.6 TB Disks space, 62.9 GB m= emory with 24 CPU cores (OS: centos-release-6-8.el6.centos.12.3.x86_64 ) > 2. I am running the phoenix parcel (4.7.0) on Cloudera 5.7.2-1.cdh5.7.2.p0= .18.=20 >=20 > The data got uploaded and a compaction was manually triggered on hbase. > There were 2 problems we were trying to find the answer to : >=20 > 1. While doing explain plan on standard TPCH data on LINEITEM table provid= ed it shows 8,649,179,394 rows but there are only 600,000,000 records upload= ed. >=20 > 0: jdbc:phoenix> explain select * from TPCH.LINEITEM where L_SUPPKEY =3D 7= 68951; > +-------------------------------------------------------------------------= --------------------------------------+ > | PLAN = | > +-------------------------------------------------------------------------= --------------------------------------+ > | CLIENT 1458-CHUNK 8649179394 ROWS 424044167376 BYTES PARALLEL 1-WAY ROUN= D ROBIN FULL SCAN OVER TPCH.LINEITEM | > | SERVER FILTER BY L_SUPPKEY =3D 768951 = | > +-------------------------------------------------------------------------= --------------------------------------+ > 2 rows selected (3.036 seconds)=20 >=20 > I could not do a count(*) on the table due to the fact that it always fail= ed for me with the error code Error: Operation timed out. (state=3DTIM01,cod= e=3D6000) >=20 > 2. Secondly, I was not able to also run a simple query01 published by TPCH= as it times out regularly: >=20 >=20 > 0: jdbc:phoenix:stl-colo-srv050> select l_returnflag, l_linestatus,sum(l_q= uantity) as sum_qty,sum(l_extendedprice) as sum_base_price,sum(l_extendedpri= ce * (1 - l_discount)) as sum_disc_price,sum(l_extendedprice * (1 - l_discou= nt) * (1 + l_tax)) as sum_charge,avg(l_quantity) as avg_qty,avg(l_extendedpr= ice) as avg_price,avg(l_discount) as avg_disc,count(*) as count_order from T= PCH.lineitem where l_shipdate <=3D current_date()- 90 group by l_returnflag,= l_linestatus order by l_returnflag,l_linestatus > . . . . . . . . . . . . . . . . . . . . . . .> ; > Error: Operation timed out. (state=3DTIM01,code=3D6000) > java.sql.SQLTimeoutException: Operation timed out. > at org.apache.phoenix.exception.SQLExceptionCode$14.newException(SQ= LExceptionCode.java:359) > at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQL= ExceptionInfo.java:145) > at org.apache.phoenix.iterate.BaseResultIterators.getIterators(Base= ResultIterators.java:728) > at org.apache.phoenix.iterate.BaseResultIterators.getIterators(Base= ResultIterators.java:638) > at org.apache.phoenix.iterate.MergeSortResultIterator.getMinHeap(Me= rgeSortResultIterator.java:72) > at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(M= ergeSortResultIterator.java:93) > at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSor= tResultIterator.java:58) > at org.apache.phoenix.iterate.BaseGroupedAggregatingResultIterator.= next(BaseGroupedAggregatingResultIterator.java:64) > at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.j= ava:778) > at sqlline.BufferedRows.(BufferedRows.java:37) > at sqlline.SqlLine.print(SqlLine.java:1650) > at sqlline.Commands.execute(Commands.java:833) > at sqlline.Commands.sql(Commands.java:732) > at sqlline.SqlLine.dispatch(SqlLine.java:808) > at sqlline.SqlLine.begin(SqlLine.java:681) > at sqlline.SqlLine.start(SqlLine.java:398) > at sqlline.SqlLine.main(SqlLine.java:292) > 0: jdbc:phoenix:stl-colo-srv050>=20 >=20 > On firing smaller queries like attaching a limit the data comes in fine : >=20 >=20 > 0: jdbc:phoenix:stl-colo-srv052> select * from TPCH.LINEITEM limit 10; > +-------------+------------+------------+---------------+-------------+---= ---------------+-------------+--------+---------------+---------------+-----= ---------------------+--------------------------+---------+ > | L_ORDERKEY | L_PARTKEY | L_SUPPKEY | L_LINENUMBER | L_QUANTITY | L_= EXTENDEDPRICE | L_DISCOUNT | L_TAX | L_RETURNFLAG | L_LINESTATUS | = L_SHIPDATE | L_COMMITDATE | L_ | > +-------------+------------+------------+---------------+-------------+---= ---------------+-------------+--------+---------------+---------------+-----= ---------------------+--------------------------+---------+ > | 1 | 15518935 | 768951 | 1 | 17 | 33= 203.72 | 0.04 | 0.02 | N | O | 1996= -03-13 00:00:00.000 | 1996-02-12 00:00:00.000 | 1996-03 | > | 1 | 6730908 | 730909 | 2 | 36 | 69= 788.52 | 0.09 | 0.06 | N | O | 1996= -04-12 00:00:00.000 | 1996-02-28 00:00:00.000 | 1996-04 | > | 1 | 6369978 | 369979 | 3 | 8 | 16= 381.28 | 0.1 | 0.02 | N | O | 1996= -01-29 00:00:00.000 | 1996-03-05 00:00:00.000 | 1996-01 | > | 1 | 213150 | 463151 | 4 | 28 | 29= 767.92 | 0.09 | 0.06 | N | O | 1996= -04-21 00:00:00.000 | 1996-03-30 00:00:00.000 | 1996-05 | > | 1 | 2402664 | 152671 | 5 | 24 | 37= 596.96 | 0.1 | 0.04 | N | O | 1996= -03-30 00:00:00.000 | 1996-03-14 00:00:00.000 | 1996-04 | > | 1 | 1563445 | 63448 | 6 | 32 | 48= 267.84 | 0.07 | 0.02 | N | O | 1996= -01-30 00:00:00.000 | 1996-02-07 00:00:00.000 | 1996-02 | > | 2 | 10616973 | 116994 | 1 | 38 | 71= 798.72 | 0 | 0.05 | N | O | 1997= -01-28 00:00:00.000 | 1997-01-14 00:00:00.000 | 1997-02 | > | 3 | 429697 | 179698 | 1 | 45 | 73= 200.15 | 0.06 | 0 | R | F | 1994= -02-02 00:00:00.000 | 1994-01-04 00:00:00.000 | 1994-02 | > | 3 | 1903543 | 653547 | 2 | 49 | 75= 776.05 | 0.1 | 0 | R | F | 1993= -11-09 00:00:00.000 | 1993-12-20 00:00:00.000 | 1993-11 | > | 3 | 12844823 | 344848 | 3 | 27 | 47= 713.86 | 0.06 | 0.07 | A | F | 1994= -01-16 00:00:00.000 | 1993-11-22 00:00:00.000 | 1994-01 | > +-------------+------------+------------+---------------+-------------+---= ---------------+-------------+--------+---------------+---------------+-----= ---------------------+--------------------------+---------+ > 10 rows selected (0.603 seconds) > 0: jdbc:phoenix:stl-colo-srv052>=20 >=20 >=20 > I am sure i am doing something wrong here and would greatly appreciate if y= ou could please point me to the same. >=20 > Thanks Again >=20 > Amit --Apple-Mail-1013EE48-2791-4F2A-84E3-6B86322947E6--