Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 806FD1878E for ; Wed, 30 Dec 2015 20:43:22 +0000 (UTC) Received: (qmail 14529 invoked by uid 500); 30 Dec 2015 20:43:20 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 14464 invoked by uid 500); 30 Dec 2015 20:43:20 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 14454 invoked by uid 99); 30 Dec 2015 20:43:20 -0000 Received: from Unknown (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 30 Dec 2015 20:43:20 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 8F6C21A0285 for ; Wed, 30 Dec 2015 20:43:19 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 4.102 X-Spam-Level: **** X-Spam-Status: No, score=4.102 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=3, KAM_LINEPADDING=1.2, LOTS_OF_MONEY=0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=handybook.com Received: from mx1-us-west.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id b2R0tYgErpXL for ; Wed, 30 Dec 2015 20:43:05 +0000 (UTC) Received: from mail-vk0-f51.google.com (mail-vk0-f51.google.com [209.85.213.51]) by mx1-us-west.apache.org (ASF Mail Server at mx1-us-west.apache.org) with ESMTPS id 996F720CC7 for ; Wed, 30 Dec 2015 20:43:04 +0000 (UTC) Received: by mail-vk0-f51.google.com with SMTP id a123so50676824vkh.1 for ; Wed, 30 Dec 2015 12:43:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=handybook.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type; bh=mXWlAio9oLUVgoquWP61AnVSsLioU370GBS4Vnq1ooE=; b=Nb/LDIlXElPtiydQQHEjoJumv3ruNQZky6NA3T0kBXtEsUshFUFHKBTwYvkvSP6E8S Bfb8d6erssLU4drr4Hi5KdC/W1LuFreDw4rr+9oWraSGNSJmQOKBrsRwqeMKt34mOmXl co4Hj7njBjxYQRuVSOoEdNiDntJccHL/15910= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:content-type; bh=mXWlAio9oLUVgoquWP61AnVSsLioU370GBS4Vnq1ooE=; b=YInbrviYwz1MXu6Ia+MHtOxtmPY+ZXAa6cx8qOKpcSg4adHW1JxRmyZpnF1wzuZ2aj pU6clVHjS4aogCvB7odD9BtJfKSh9oCrAs1GOqE8uqTiVxFbrGVgOHANw9i1h5HCXZYu YaL5E7Ly4emi/iqOA4ztuazqDJv4Xfc/4ENa5hHlwGuUma5/x7bwZjSOT1bgDR/HmPPF 7LsNeSxCHJQpBiociFQgifFJUbpC1O1uZLrUP5p+yHsdQkcduBxG2u405IjnQJsaqzVC Pzjy1iwFAc0SWQOOilu6fYMRMmCNwQc2Uzs9AxUf1MeLbJQLCsbLu7F1uH7ca5QzN4FJ mCfg== X-Gm-Message-State: ALoCoQlGPLrpFRVop824I2cY6283mWoO0XDL5rZq7fPCCGTh7oZ546X9ZLtrv1IsPYQYOTXhODeMzB3PgPj0IHPI/oHgESGT305UepaH9twC3D/O/GoqHuhEiXUBAugGRXWefDEukgAq/soWRBoV9I/l2zbcFa7gy2sHmks6zGtnIqPHxP5O+lZFI0SOhTYzTRKQPHmJckEbsuMCf/0Xn8hqnZjXUP4X+w== X-Received: by 10.31.149.75 with SMTP id x72mr43635049vkd.81.1451508178142; Wed, 30 Dec 2015 12:42:58 -0800 (PST) MIME-Version: 1.0 Received: by 10.31.177.194 with HTTP; Wed, 30 Dec 2015 12:42:38 -0800 (PST) In-Reply-To: <01ee01d14341$5925a6b0$0b70f410$@peridale.co.uk> References: <015001d14227$2e852310$8b8f6930$@peridale.co.uk> <3EDBB066-514E-4C4C-90D1-E1802B05A7FB@gmail.com> <01aa01d14300$4f4afaf0$ede0f0d0$@peridale.co.uk> <2F5F6F5F-FA73-4D85-82BD-3A60281EBD9D@gmail.com> <01e101d14337$071e65f0$155b31d0$@peridale.co.uk> <01ee01d14341$5925a6b0$0b70f410$@peridale.co.uk> From: Marcin Tustin Date: Wed, 30 Dec 2015 15:42:38 -0500 Message-ID: Subject: Re: Running the same query on 1 billion rows fact table in Hive on Spark compared to Sybase IQ columnar database To: user@hive.apache.org Content-Type: multipart/related; boundary=001a1142616058ec9d0528239533 --001a1142616058ec9d0528239533 Content-Type: multipart/alternative; boundary=001a1142616058ec990528239532 --001a1142616058ec990528239532 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable I'm afraid I use the HDP distribution so I haven't yet had to compile anything. (Incidentally, this isn't a recommendation of HDP over anything else). On Wed, Dec 30, 2015 at 3:33 PM, Mich Talebzadeh wrote: > Thanks Marcin > > > > Trying to build TEZ 0.7 in > > > > /usr/lib/apache-tez-0.7.0-src > > > > using > > > > mvn -X clean package -DskipTests=3Dtrue -Dmaven.javadoc.skip=3Dtrue > > > > with mvn version 3.2.5 (as opposed to 3.3) as I read that I can build it > OK with 3.2.5 following the same error ass below > > > > mvn --version > > Apache Maven *3.2.5* (12a6b3acb947671f09b81f49094c53f426d8cea1; > 2014-12-14T17:29:23+00:00) > > Maven home: /usr/local/apache-maven/apache-maven-3.2.5 > > Java version: 1.7.0_25, vendor: Oracle Corporation > > Java home: /usr/java/jdk1.7.0_25/jre > > > > *I get this error* > > > > [INFO] tez-ui ............................................. FAILURE [ > 0.411 s] > > [ > > > > DEBUG] -- end configuration -- > > [INFO] Running 'npm install --color=3Dfalse' in > /usr/lib/apache-tez-0.7.0-src/tez-ui/src/main/webapp > > [INFO] > /usr/lib/apache-tez-0.7.0-src/tez-ui/src/main/webapp/node/with_new_path.s= h: > line 3: 23781 Aborted "$@" > > > > > > [ERROR] Failed to execute goal > com.github.eirslett:frontend-maven-plugin:0.0.16:npm (npm install) on > project tez-ui: Failed to run task: 'npm install --color=3Dfalse' failed. > (error code 134) -> [Help 1] > > org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute > goal com.github.eirslett:frontend-maven-plugin:0.0.16:npm (npm install) o= n > project tez-ui: Failed to run task > > > > > > any ideas as there is little info available in net. > > > > > > Thanks > > > > Mich Talebzadeh > > > > *Sybase ASE 15 Gold Medal Award 2008* > > A Winning Strategy: Running the most Critical Financial Data on ASE 15 > > > http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-0919= 08.pdf > > Author of the books* "A Practitioner=E2=80=99s Guide to Upgrading to Syba= se ASE > 15", ISBN 978-0-9563693-0-7*. > > co-author *"Sybase Transact SQL Guidelines Best Practices", ISBN > 978-0-9759693-0-4* > > *Publications due shortly:* > > *Complex Event Processing in Heterogeneous Environments*, ISBN: > 978-0-9563693-3-8 > > *Oracle and Sybase, Concepts and Contrasts*, ISBN: 978-0-9563693-1-4, vol= ume > one out shortly > > > > http://talebzadehmich.wordpress.com > > > > NOTE: The information in this email is proprietary and confidential. This > message is for the designated recipient only, if you are not the intended > recipient, you should destroy it immediately. Any information in this > message shall not be understood as given or endorsed by Peridale Technolo= gy > Ltd, its subsidiaries or their employees, unless expressly so stated. It = is > the responsibility of the recipient to ensure that this email is virus > free, therefore neither Peridale Ltd, its subsidiaries nor their employee= s > accept any responsibility. > > > > *From:* Marcin Tustin [mailto:mtustin@handybook.com] > *Sent:* 30 December 2015 19:27 > > *To:* user@hive.apache.org > *Subject:* Re: Running the same query on 1 billion rows fact table in > Hive on Spark compared to Sybase IQ columnar database > > > > I'm using TEZ 0.7.0.2.3 with hive 1.2.1.2.3. I can confirm that TEZ is > much faster than MR in pretty much all cases. Also, with hive, you'll mak= e > sure you've performed optimizations like aligning ORC stripe sizes with > HDFS block sizes, and concatenated your tables (not so much an optimizati= on > as a must for avoiding the small files problem). > > > > On Wed, Dec 30, 2015 at 2:19 PM, Mich Talebzadeh > wrote: > > Thanks again Jorn. > > > > > > Both Hive and Sybase IQ are running on the same host. Yes for Sybase IQ I > have compression enabled. The FACT table in IQ (sales) has LF (read bitma= p) > indexes on the time_id column. For the dimension table (times) I have > time_id defined as primary key. Also Sybase IQ creates FP (fast projectio= n) > indexes on every column by default. > > > > Anyway I am trying to download and build TEZ. Do we know which version of > TEZ works with Hive 1.2.1 please? 0.8 seems to be in alpha > > > > Thanks > > > > Mich Talebzadeh > > > > *Sybase ASE 15 Gold Medal Award 2008* > > A Winning Strategy: Running the most Critical Financial Data on ASE 15 > > > http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-0919= 08.pdf > > Author of the books* "A Practitioner=E2=80=99s Guide to Upgrading to Syba= se ASE > 15", ISBN 978-0-9563693-0-7*. > > co-author *"Sybase Transact SQL Guidelines Best Practices", ISBN > 978-0-9759693-0-4* > > *Publications due shortly:* > > *Complex Event Processing in Heterogeneous Environments*, ISBN: > 978-0-9563693-3-8 > > *Oracle and Sybase, Concepts and Contrasts*, ISBN: 978-0-9563693-1-4, vol= ume > one out shortly > > > > http://talebzadehmich.wordpress.com > > > > NOTE: The information in this email is proprietary and confidential. This > message is for the designated recipient only, if you are not the intended > recipient, you should destroy it immediately. Any information in this > message shall not be understood as given or endorsed by Peridale Technolo= gy > Ltd, its subsidiaries or their employees, unless expressly so stated. It = is > the responsibility of the recipient to ensure that this email is virus > free, therefore neither Peridale Ltd, its subsidiaries nor their employee= s > accept any responsibility. > > > > *From:* J=C3=B6rn Franke [mailto:jornfranke@gmail.com] > *Sent:* 30 December 2015 16:29 > > > *To:* user@hive.apache.org > *Subject:* Re: Running the same query on 1 billion rows fact table in > Hive on Spark compared to Sybase IQ columnar database > > > > > Hmm i think the execution Engine TEZ has (currently) the most > optimizations on Hive. What about your hardware - is it the same? Do you > have also compression on Sybase? > > Alternatively you need to wait for Hive for interactive analytics (tez 0.= 8 > + llap). > > > On 30 Dec 2015, at 13:47, Mich Talebzadeh wrote: > > Hi Jorn, > > > > Thanks for your reply. My Hive version is 1.2.1 on Spark 1.3.1. I have no= t > tried it on TEZ. I tried the query on MR engine and it did nor fair bette= r. > I also ran it without SDDDEV function and found out that the function did > not slow it down. > > > > I tried a simple query as follows builr in sales FACT table 1e9 rows and > dimension table times (1826 rows) > > > > -- > > -- Get the total amount sold for each calendar month > > -- > > *SELECT t.calendar_month_desc, SUM(s.amount_sold)* > > *FROM sales s, times t WHERE s.time_id =3D t.time_id* > > *GROUP BY t.calendar_month_desc;* > > > > Now Sybase IQ comes back in around 30 seconds. > > > > Started query at Dec 30 2015 08:14:33:399AM > > (48 rows affected) > > Finished query at Dec 30 2015 08:15:04:640AM > > > > Whereas Hive with the following setting and running the same query > > > > set > hive.input.format=3Dorg.apache.hadoop.hive.ql.io.BucketizedHiveInputForma= t; > > set hive.optimize.bucketmapjoin=3Dtrue; > > set hive.optimize.bucketmapjoin.sortedmerge=3Dtrue; > > > > Comes back in > > > > 48 rows selected (1514.687 seconds) > > > > I don=E2=80=99t know what else can be done. Obviously this is all schema = on read > so I am not sure I can change bucketing on FACT table based on one query > alone! > > > > > > > > +--------------------------------------------------------------------+--+ > > | createtab_stmt | > > +--------------------------------------------------------------------+--+ > > | CREATE TABLE `times`( | > > | `time_id` timestamp, | > > | `day_name` varchar(9), | > > | `day_number_in_week` int, | > > | `day_number_in_month` int, | > > | `calendar_week_number` int, | > > | `fiscal_week_number` int, | > > | `week_ending_day` timestamp, | > > | `week_ending_day_id` bigint, | > > | `calendar_month_number` int, | > > | `fiscal_month_number` int, | > > | `calendar_month_desc` varchar(8), | > > ---------- > > | `days_in_fis_year` bigint, | > > | `end_of_cal_year` timestamp, | > > | `end_of_fis_year` timestamp) | > > | CLUSTERED BY ( | > > | time_id) | > > | INTO 256 BUCKETS | > > | ROW FORMAT SERDE | > > | 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' | > > | STORED AS INPUTFORMAT | > > | 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' | > > | OUTPUTFORMAT | > > | 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' | > > | LOCATION | > > | 'hdfs://rhes564:9000/user/hive/warehouse/oraclehadoop.db/times' | > > | TBLPROPERTIES ( | > > | 'COLUMN_STATS_ACCURATE'=3D'true', | > > | 'numFiles'=3D'1', | > > | 'numRows'=3D'1826', | > > | 'orc.bloom.filter.columns'=3D'TIME_ID', | > > | 'orc.bloom.filter.fpp'=3D'0.05', | > > | 'orc.compress'=3D'SNAPPY', | > > | 'orc.create.index'=3D'true', | > > | 'orc.row.index.stride'=3D'10000', | > > | 'orc.stripe.size'=3D'268435456', | > > | 'rawDataSize'=3D'0', | > > | 'totalSize'=3D'11155', | > > | 'transient_lastDdlTime'=3D'1451429900') | > > > > ; > > > > > > http://talebzadehmich.wordpress.com > > > > NOTE: The information in this email is proprietary and confidential. This > message is for the designated recipient only, if you are not the intended > recipient, you should destroy it immediately. Any information in this > message shall not be understood as given or endorsed by Peridale Technolo= gy > Ltd, its subsidiaries or their employees, unless expressly so stated. It = is > the responsibility of the recipient to ensure that this email is virus > free, therefore neither Peridale Ltd, its subsidiaries nor their employee= s > accept any responsibility. > > > > *From:* J=C3=B6rn Franke [mailto:jornfranke@gmail.com ] > *Sent:* 30 December 2015 08:28 > *To:* user@hive.apache.org > *Subject:* Re: Running the same query on 1 billion rows fact table in > Hive on Spark compared to Sybase IQ columnar database > > > > Have you tried it with Hive ob TEZ? It contains (currently) more > optimizations than Hive on Spark. > > I assume you use the latest Hive version. > > Additionally you may want to think about calculating statistics (dependin= g > on your configuration you need to trigger it) - I am not sure if Spark ca= n > use them. > > I am not sure if bloom filters on the columns you mention make sense. You > may also want to increase stride size (depending on your data). > > Currently you bucket by a lot of fields, which may not make sense. You > also may want to sort the data by customer Id in the table. > > You also seem to have a lot of reducers, which you may want to decrease. > > > > Have you tried without "having stddev_samp" ? Is the query exactly the > same as in Sybase? > > > On 29 Dec 2015, at 11:53, Mich Talebzadeh wrote: > > Hi, > > > > I have a fact table in Hive imported from Sybase IQ via SQOOP with 1 > billion rows as follows: > > > > show create table sales; > > > +------------------------------------------------------------------------= -------+--+ > > | > createtab_stmt | > > > +------------------------------------------------------------------------= -------+--+ > > | CREATE TABLE > `sales`( | > > | `prod_id` > bigint, | > > | `cust_id` > bigint, | > > | `time_id` > timestamp, | > > | `channel_id` > bigint, | > > | `promo_id` > bigint, | > > | `quantity_sold` > decimal(10,0), | > > | `amount_sold` > decimal(10,0)) | > > | CLUSTERED BY ( > | > > | > prod_id, > | > > | > cust_id, > | > > | time_id, > | > > | > channel_id, > | > > | > promo_id) > | > > | INTO 256 BUCKETS > | > > | ROW FORMAT > SERDE | > > | > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' > | > > | STORED AS INPUTFORMAT > | > > | > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' > | > > | > OUTPUTFORMAT > | > > | 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' > | > > | > LOCATION > | > > | > 'hdfs://rhes564:9000/user/hive/warehouse/oraclehadoop.db/sales' > | > > | TBLPROPERTIES > ( | > > | > 'COLUMN_STATS_ACCURATE'=3D'true', > | > > | > 'last_modified_by'=3D'hduser', > | > > | > 'last_modified_time'=3D'1451305626', > | > > | > 'numFiles'=3D'11', > | > > | > 'numRows'=3D'1000000000', > | > > | > 'orc.bloom.filter.columns'=3D'PROD_ID,CUST_ID,TIME_ID,CHANNEL_ID,PROMO_ID= ', > | > > | > 'orc.bloom.filter.fpp'=3D'0.05', > | > > | > 'orc.compress'=3D'SNAPPY', > | > > | > 'orc.create.index'=3D'true', > | > > | > 'orc.row.index.stride'=3D'10000', > | > > | > 'orc.stripe.size'=3D'268435456', > | > > | > 'rawDataSize'=3D'296000000000', > | > > | 'totalSize'=3D'2678882153', > | > > | > 'transient_lastDdlTime'=3D'1451305626') > | > > > +------------------------------------------------------------------------= -------+--+ > > > > I use the following query to run against sales table only against Hive > > > > SELECT > > rs.Customer_ID > > , rs.Number_of_orders > > , rs.Total_customer_amount > > , rs.Average_order > > , rs.Standard_deviation > > FROM > > ( > > SELECT cust_id AS Customer_ID, > > COUNT(amount_sold) AS Number_of_orders, > > SUM(amount_sold) AS Total_customer_amount, > > AVG(amount_sold) AS Average_order, > > stddev_samp(amount_sold) AS Standard_deviation > > FROM sales > > GROUP BY cust_id > > HAVING SUM(amount_sold) > 94000 > > AND AVG(amount_sold) < stddev_samp(amount_sold) > > ) rs > > ORDER BY > > -- Total_customer_amount DESC > > 3 DESC > > > > Hive comes back in 17 minutes with 5,948 rows > > > > bl -f sales.hql > sales.log > > Connecting to jdbc:hive2://rhes564:10010/default > > Connected to: Apache Hive (version 1.2.1) > > Driver: Hive JDBC (version 1.2.1) > > Transaction isolation: TRANSACTION_REPEATABLE_READ > > Running init script /home/hduser/dba/bin/hive_on_spark_init.hql > > No rows affected (0.097 seconds) > > No rows affected (0.001 seconds) > > No rows affected (0.001 seconds) > > No rows affected (0.038 seconds) > > INFO : Warning: Using constant number 3 in order by. If you try to use > position alias when hive.groupby.orderby.position.alias is false, the > position alias will be ignored. > > INFO : > > Query Hive on Spark job[0] stages: > > INFO : 0 > > INFO : 1 > > INFO : 2 > > INFO : > > Status: Running (Hive on Spark job[0]) > > INFO : Job Progress Format > > CurrentTime StageId_StageAttemptId: > SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount > [StageCost] > > INFO : 2015-12-29 09:33:25,815 Stage-0_0: 0/11 Stage-1_0: 0/1009 > Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:28,829 Stage-0_0: 0/11 Stage-1_0: 0/1009 > Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:31,857 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:34,875 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:37,903 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:40,918 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:43,939 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:46,958 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:49,971 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:52,991 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:33:56,007 Stage-0_0: 0(+2)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > > > INFO : 2015-12-29 09:50:03,578 Stage-0_0: 10(+1)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:06,590 Stage-0_0: 10(+1)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:09,602 Stage-0_0: 10(+1)/11 Stage-1_0: > 0/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:10,606 Stage-0_0: 11/11 Finished Stage-1_0= : > 0(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:11,610 Stage-0_0: 11/11 Finished Stage-1_0= : > 6(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:12,618 Stage-0_0: 11/11 Finished Stage-1_0= : > 30(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:13,622 Stage-0_0: 11/11 Finished Stage-1_0= : > 59(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:14,626 Stage-0_0: 11/11 Finished Stage-1_0= : > 90(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:15,631 Stage-0_0: 11/11 Finished Stage-1_0= : > 124(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:16,654 Stage-0_0: 11/11 Finished Stage-1_0= : > 160(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:17,659 Stage-0_0: 11/11 Finished Stage-1_0= : > 193(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:18,663 Stage-0_0: 11/11 Finished Stage-1_0= : > 228(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:19,667 Stage-0_0: 11/11 Finished Stage-1_0= : > 262(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:20,672 Stage-0_0: 11/11 Finished Stage-1_0= : > 298(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:21,679 Stage-0_0: 11/11 Finished Stage-1_0= : > 338(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:22,687 Stage-0_0: 11/11 Finished Stage-1_0= : > 376(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:23,691 Stage-0_0: 11/11 Finished Stage-1_0= : > 417(+3)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:24,696 Stage-0_0: 11/11 Finished Stage-1_0= : > 460(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:25,699 Stage-0_0: 11/11 Finished Stage-1_0= : > 502(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:26,707 Stage-0_0: 11/11 Finished Stage-1_0= : > 542(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:27,712 Stage-0_0: 11/11 Finished Stage-1_0= : > 584(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:28,719 Stage-0_0: 11/11 Finished Stage-1_0= : > 624(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:29,730 Stage-0_0: 11/11 Finished Stage-1_0= : > 667(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:30,736 Stage-0_0: 11/11 Finished Stage-1_0= : > 709(+3)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:31,740 Stage-0_0: 11/11 Finished Stage-1_0= : > 754(+3)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:32,743 Stage-0_0: 11/11 Finished Stage-1_0= : > 797(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:33,747 Stage-0_0: 11/11 Finished Stage-1_0= : > 844(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:34,754 Stage-0_0: 11/11 Finished Stage-1_0= : > 888(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:35,759 Stage-0_0: 11/11 Finished Stage-1_0= : > 934(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:36,764 Stage-0_0: 11/11 Finished Stage-1_0= : > 981(+2)/1009 Stage-2_0: 0/1 > > INFO : 2015-12-29 09:50:37,768 Stage-0_0: 11/11 Finished Stage-1_0= : > 1009/1009 Finished Stage-2_0: 0(+1)/1 > > INFO : 2015-12-29 09:50:38,771 Stage-0_0: 11/11 Finished Stage-1_0= : > 1009/1009 Finished Stage-2_0: 1/1 Finished > > INFO : Status: Finished successfully in 1036.00 seconds > > *5,948 rows selected (1074.817 seconds)* > > > > So it returns 5948 rows in 17 minutes. In contrast IQ returns 5947 rows i= n > 23 seconds > > > > Sybase IQ is a columnar database so each column is created as a fast > projection index by default. In addition I have created LF (bitmap) index= es > on dimension columns (PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID). N= ow > the query only touches CUST_ID. > > > > My suspicion is that it is the Standard Deviation function stddev_samp() = that > could be the bottleneck? > > > > Thanks > > > > Mich Talebzadeh > > > > *Sybase ASE 15 Gold Medal Award 2008* > > A Winning Strategy: Running the most Critical Financial Data on ASE 15 > > > http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-0919= 08.pdf > > Author of the books* "A Practitioner=E2=80=99s Guide to Upgrading to Syba= se ASE > 15", ISBN 978-0-9563693-0-7*. > > co-author *"Sybase Transact SQL Guidelines Best Practices", ISBN > 978-0-9759693-0-4* > > *Publications due shortly:* > > *Complex Event Processing in Heterogeneous Environments*, ISBN: > 978-0-9563693-3-8 > > *Oracle and Sybase, Concepts and Contrasts*, ISBN: 978-0-9563693-1-4, vol= ume > one out shortly > > > > http://talebzadehmich.wordpress.com > > > > NOTE: The information in this email is proprietary and confidential. This > message is for the designated recipient only, if you are not the intended > recipient, you should destroy it immediately. Any information in this > message shall not be understood as given or endorsed by Peridale Technolo= gy > Ltd, its subsidiaries or their employees, unless expressly so stated. It = is > the responsibility of the recipient to ensure that this email is virus > free, therefore neither Peridale Ltd, its subsidiaries nor their employee= s > accept any responsibility. > > > > > > > > Want to work at Handy? Check out our culture deck and open roles > > > Latest news at Handy > > Handy just raised $50m > led > by Fidelity > > > > [image: Image removed by sender.] > --=20 Want to work at Handy? Check out our culture deck and open roles=20 Latest news at Handy Handy just raised $50m=20 led=20 by Fidelity --001a1142616058ec990528239532 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
I'm afraid I use the HDP distribution so I haven't= yet had to compile anything. (Incidentally, this isn't a recommendatio= n of HDP over anything else).=C2=A0

On Wed, Dec 30, 2015 at 3:33 PM, Mich Talebzadeh <mich@peridale.co.uk> wrote:

Thanks Marcin

= =C2=A0

Trying to build TEZ 0.7 in=

=C2=A0

=

/usr/lib/apache-tez-0.7.0-src

=C2=A0

using

=C2=A0

mvn -X clean package -DskipTest= s=3Dtrue -Dmaven.javadoc.skip=3Dtrue

=C2=A0

with mvn ver= sion 3.2.5 (as opposed to 3.3) as I read that I can build it OK with 3.2.5 = following the same error ass below

=C2=A0

mvn --vers= ion

Apache Maven 3.2= .5 (12a6b3acb947671f09b81f49094c53f426d8cea1; 2014-12-14T17:29:23+00:00= )

Maven home: /usr/loca= l/apache-maven/apache-maven-3.2.5

Java version: 1.7.0_25, vendor: Oracle Corporation=

Java home: /usr/java/jdk1.7.0_25/jre=

=C2=A0<= /p>

I get this error

=C2=A0

<= span style=3D"font-size:11.0pt;font-family:"Arial",sans-serif">[<= /span>INFO] tez-ui ............................................. FAILU= RE [=C2=A0 0.411 s]

[

=C2=A0

DEBUG] -- end configuration --=

[INFO] Running 'npm install = --color=3Dfalse' in /usr/lib/apache-tez-0.7.0-src/tez-ui/src/main/webap= p

[INFO] /usr/lib/apach= e-tez-0.7.0-src/tez-ui/src/main/webapp/node/with_new_path.sh: line 3: 23781= Aborted=C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0"$@"

=C2=A0

=C2=A0

[ERROR] Fa= iled to execute goal com.github.eirslett:frontend-maven-plugin:0.0.16:npm (= npm install) on project tez-ui: Failed to run task: 'npm install --colo= r=3Dfalse' failed. (error code 134) -> [Help 1]=

org.apache.maven.lifecycle.LifecycleExecuti= onException: Failed to execute goal com.github.eirslett:frontend-maven-plug= in:0.0.16:npm (npm install) on project tez-ui: Failed to run task=

=C2=A0

<= p class=3D"MsoNormal">=C2=A0

= a= ny ideas as there is little info available in net.

=

=C2=A0

=C2=A0

Thank= s

=C2=A0=

Mich Talebzadeh

=C2=A0

Sybas= e ASE 15 Gold Medal Award 2008

A Winning Strategy: Runni= ng the most Critical Financial Data on ASE 15

http://login.sybase.com/= files/Product_Overviews/ASE-Winning-Strategy-091908.pdf

A= uthor of the books "A Practitioner=E2=80=99s Guide to Upgrading to = Sybase ASE 15", ISBN 978-0-9563693-0-7.

<= p class=3D"MsoNormal" style=3D"text-autospace:none">co-author = "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-97596= 93-0-4

Publications due shortly:<= u>

Complex Event Processing in He= terogeneous Environments, ISBN: 978-0-9563693-3-8<= /u>

= Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-= 4, volume one out shortly<= /span>

= =C2=A0

h= ttp://talebzadehmich.wordpress.com

=C2=A0

N= OTE: The information in this email is proprietary and confidential. This me= ssage is for the designated recipient only, if you are not the intended rec= ipient, you should destroy it immediately. Any information in this message = shall not be understood as given or endorsed by Peridale Technology Ltd, it= s subsidiaries or their employees, unless expressly so stated. It is the re= sponsibility of the recipient to ensure that this email is virus free, ther= efore neither Peridale Ltd, its subsidiaries nor their employees accept any= responsibility.

=C2=A0

From: Marcin Tustin [mailto:mtustin@handybook.com]
Sent: 30 December 2015 19:27


To:
= user@hive.apache.org
Subject: Re: Running the same query on 1= billion rows fact table in Hive on Spark compared to Sybase IQ columnar da= tabase

=C2=A0

I'm using= TEZ=C2=A00.7.0.2.3 with hive 1.2.1.2.3. I can confirm that TEZ = is much faster than MR in pretty much all cases. Also, with hive, you'l= l make sure you've performed optimizations like aligning ORC stripe siz= es with HDFS block sizes, and concatenated your tables (not so much an opti= mization as a must for avoiding the small files problem).<= /u>

=C2=A0

On Wed, Dec 30, 2015 at 2:19 PM, Mich Talebzadeh <mich@peridale.co.uk> wrote:

Thanks again Jorn.

=C2= =A0

=C2=A0

Both H= ive and Sybase IQ are running on the same host. Yes for Sybase IQ I have co= mpression enabled. The FACT table in IQ (sales) has LF (read bitmap) indexe= s on the time_id column. For the dimension table (times) I have time_id def= ined as primary key. Also Sybase IQ creates FP (fast projection) indexes on= every column by default.

=C2=A0=

Anyway I am trying to download and build TEZ. Do we know wh= ich version of TEZ works with Hive 1.2.1 please? 0.8 seems to be in alpha

=C2=A0

Thanks

=C2=A0

Mich Tale= bzadeh

=C2=A0

Sybase ASE 15 Gold Medal Award 2008

A W= inning Strategy: Running the most Critical Financial Data on ASE 15<= u>

ht= tp://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.p= df

Author of the books "A Practitioner=E2=80=99s Gu= ide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.

co-author "Sybase Transact SQL Guidelines Best Practices&quo= t;, ISBN 978-0-9759693-0-4

Pu= blications due shortly:

Complex Event Processing in Hete= rogeneous Environments, ISBN: 978-0-9563693-3-8

<= span style=3D"font-size:10.0pt;font-family:"Arial",sans-serif">Or= acle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4,= volume one out shortly=

=C2= =A0

http= ://talebzadehmich.wordpress.com

=C2=A0

NOTE:= The information in this email is proprietary and confidential. This messag= e is for the designated recipient only, if you are not the intended recipie= nt, you should destroy it immediately. Any information in this message shal= l not be understood as given or endorsed by Peridale Technology Ltd, its su= bsidiaries or their employees, unless expressly so stated. It is the respon= sibility of the recipient to ensure that this email is virus free, therefor= e neither Peridale Ltd, its subsidiaries nor their employees accept any res= ponsibility.

=C2=A0=

From: J=C3=B6rn Franke [mailto:jornfranke@gmail.com]
Sent:<= /b> 30 December 2015 16:29


To: user@hive.apache.org
Subject: Re: Running the same que= ry on 1 billion rows fact table in Hive on Spark compared to Sybase IQ colu= mnar database

=C2=A0


Hmm = i think the execution Engine TEZ has (currently) the most optimizations on = Hive. What about your hardware - is it the same? Do you have also compressi= on on Sybase?

Alternativ= ely you need to wait for Hive for interactive analytics (tez 0.8 + llap).= =C2=A0


On 30 Dec 2015, at 13:47, Mich Talebzadeh <mich@peridale.co.uk> w= rote:

Hi Jorn,

=C2=A0<= u>

Thanks for your reply. My Hive version is 1.2.1 on Sp= ark 1.3.1. I have not tried it on TEZ. I tried the query on MR engine and i= t did nor fair better. I also ran it without SDDDEV function and found out = that the function did not slow it down.

=C2=A0=

I tried a simple query as follows builr in sa= les FACT table 1e9 rows and dimension table times (1826 rows)=

=C2=A0

--=

-- Get the total amount sold for each calendar month=

--

SELECT t.calendar_month_desc, SUM(s.amount_sold)

FROM sales s, times t WHERE s.time_id =3D t.time_id

GROUP BY t.calendar_month_desc;

=C2=A0

= Now Sybase IQ comes back in around 30 seconds.

=C2=A0

Started query at Dec 30 2015= 08:14:33:399AM

(48 rows affected)

Finished query at Dec 30 2015 08:15:04= :640AM

=C2=A0

Whe= reas Hive with the following setting and running the same query

=C2=A0

set hive.input.f= ormat=3Dorg.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

set hive.optimize.bucketmapjoin=3Dtrue;

set hive.optimize.bucketmapjoin.sortedmerge=3Dtrue;

=C2=A0

Comes back = in

=C2=A0

4= 8 rows selected (1514.687 seconds)

=C2=A0

I don=E2=80=99t know what else can be done. Obviou= sly this is all schema on read so I am not sure I can change bucketing on F= ACT table based on one query alone!

=C2=A0

=C2=A0

=C2=A0=

+---------------------------------------------= -----------------------+--+

= |=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cr= eatetab_stmt=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 |

+------------------= --------------------------------------------------+--+=

| CREATE TABLE `times`(=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0|

|=C2=A0=C2=A0 `time_id` ti= mestamp,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

=

|=C2=A0=C2=A0 `day_name` varchar(9),=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= |

|=C2=A0=C2=A0 `day_number_in_week` in= t,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 |

|=C2=A0=C2=A0 `day_numb= er_in_month` int,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 `c= alendar_week_number` int,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 |

<= span style=3D"font-family:"Courier New";color:blue">|=C2=A0=C2=A0= `fiscal_week_number` int,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|= =C2=A0=C2=A0 `week_ending_day` timestamp,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0|

|= =C2=A0=C2=A0 `week_ending_day_id` bigint,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2= =A0=C2=A0 `calendar_month_number` int,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0= =C2=A0 `fiscal_month_number` int,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|= =C2=A0=C2=A0 `calendar_month_desc` varchar(8),=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0 |

----------<= /u>

|=C2=A0=C2=A0 `days_in_fis_year` bigint,=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=

|=C2=A0=C2=A0 `end_of_cal_year` timestamp,=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 `end_of_fis_year` timestamp)=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

| CLUSTERED BY (=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

=

| =C2=A0=C2=A0time_id)=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

| INTO 256 BUCKETS=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |<= /p>

| ROW FORMAT SERDE=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2= =A0=C2=A0 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

| STORED AS INPUTFORMAT=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |<= /span>

|=C2=A0=C2=A0 'org.apache.hadoop.hiv= e.ql.io.orc.OrcInputFormat'=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

=

| OUTPUTFORMAT=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |<= /p>

|=C2=A0=C2=A0 'org.apache.hadoop.hive.ql.io.orc.OrcOutput= Format'=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 |

| LOCATION=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'hdfs://rhes564:9000/user/hive/warehouse/oraclehadoo= p.db/times'=C2=A0 |

| TBLPROPERTIES = (=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0|

|=C2=A0=C2=A0 'COLUMN_= STATS_ACCURATE'=3D'true',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 |

|=C2=A0=C2=A0 'numF= iles'=3D'1',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 '= numRows'=3D'1826',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.= bloom.filter.columns'=3D'TIME_ID',=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.bloom.filter.fpp'=3D'= 0.05',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=

|=C2=A0=C2=A0 'orc.compress'=3D'SN= APPY',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2= =A0 'orc.create.index'=3D'true',=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0|

|=C2=A0=C2=A0 'orc.row.index.stride'=3D'10000&#= 39;,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.stripe.size'=3D'268435456&#= 39;,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'rawDataSize'=3D'0',= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |<= /p>

|=C2=A0=C2=A0 'totalSize'=3D'11155',=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0= =C2=A0 'transient_lastDdlTime'=3D'1451429900')=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= |

=C2=A0

;

=C2=A0

=C2=A0=

http://talebzadehmich.wordpr= ess.com

=C2=A0

NOTE: The information in this= email is proprietary and confidential. This message is for the designated = recipient only, if you are not the intended recipient, you should destroy i= t immediately. Any information in this message shall not be understood as g= iven or endorsed by Peridale Technology Ltd, its subsidiaries or their empl= oyees, unless expressly so stated. It is the responsibility of the recipien= t to ensure that this email is virus free, therefore neither Peridale Ltd, = its subsidiaries nor their employees accept any responsibility.

=C2=A0

From: J=C3=B6rn Franke [mailto:jornfranke@gmail.com]
Sent: 30 December 2015 08:= 28
To: = user@hive.apache.org
Subject: Re: Running the same query on 1= billion rows fact table in Hive on Spark compared to Sybase IQ columnar da= tabase

=C2=A0=

Have you tried it with Hive ob T= EZ? It contains (currently) more optimizations than Hive on Spark.

I assume you use the latest Hive= version.

Additionally y= ou may want to think about calculating statistics (depending on your config= uration you need to trigger it) - I am not sure if Spark can use them.

I am not sure if bloom filte= rs on the columns you mention make sense. You may also want to increase str= ide size (depending on your data).

Currently you bucket by a lot of fields, which may not make sens= e. You also may want to sort the data by customer Id in the table.

You also seem to have a lot of r= educers, which you may want to decrease.

=C2=A0

= Have you tried without "having stddev_samp" ? Is the query exactl= y the same as in Sybase?


On 29 Dec 2015, at 11:53, Mich Talebzad= eh <mich@perida= le.co.uk> wrote:

Hi,

= =C2=A0

I have a fact table in Hive imported fr= om Sybase IQ via SQOOP with 1 billion rows as follows:=

=C2=A0

show c= reate table sales;

+---= ---------------------------------------------------------------------------= -+--+

|=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 createtab_stmt=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 |

+----------------= ---------------------------------------------------------------+--+<= u>

| CREATE TABLE `sales`(=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 `prod_id` bigint,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 |

|=C2= =A0=C2=A0 `cust_id` bigint,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= |

|=C2=A0=C2=A0 `time_= id` timestamp,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 `channel_id` bigint,=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 |

|=C2= =A0=C2=A0 `promo_id` bigint,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |<= /span>

|=C2=A0=C2=A0 `quantity= _sold` decimal(10,0),=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=

|=C2=A0=C2=A0 `amount_sold` d= ecimal(10,0))=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |<= /span>

| CLUSTERED BY (=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= |

|=C2=A0=C2=A0 prod_id= ,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 cust_id,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 time_id,=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0|

|= =C2=A0=C2=A0 channel_id,=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 promo_id)=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

| INTO 256 BUCKETS=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0|

| ROW FORMAT SERDE=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'org.apache.hadoop.hive.ql.io.orc.Orc= Serde'=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

| STORED AS INPUTFORMAT=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0|

= |=C2=A0=C2=A0 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 |

| OUTPUTFORMAT=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 |

|=C2= =A0=C2=A0 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0|

| LOCATION=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'hdfs://rhes564:9000/user/hive/warehouse/oraclehad= oop.db/sales'=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 |

| TBL= PROPERTIES (=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0= =C2=A0=C2=A0|

|=C2=A0= =C2=A0 'COLUMN_STATS_ACCURATE'=3D'true',=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'last_modified_by'=3D'hduser',=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'last_modified_time'=3D'= 1451305626',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'numFiles'=3D'11',=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

=

|=C2=A0=C2=A0 'numRows'=3D'10000000= 00',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.bloom.filter.columns'=3D'PROD_ID,CUST_ID,TI= ME_ID,CHANNEL_ID,PROMO_ID',=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.bloom.filter.fpp'=3D'= 0.05',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.compress'= =3D'SNAPPY',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.create.index'=3D'true',=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.row.index.st= ride'=3D'10000',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |

|=C2=A0=C2=A0 'orc.str= ipe.size'=3D'268435456',=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 |

|=C2=A0=C2=A0 = 'rawDataSize'=3D'296000000000',=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 |

<= span style=3D"font-size:10.0pt;font-family:"Courier New";color:bl= ue">|=C2=A0=C2=A0 'totalSize'=3D'2678882153', =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0|

|=C2=A0=C2=A0 'transient_lastDdlTime'=3D'1451305= 626')=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 |

+-----= --------------------------------------------------------------------------+= --+

=C2=A0

I use = the following query to run against sales table only against Hive<= /u>

=C2=A0

SELECT=

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0 rs.Customer_ID

=C2=A0=C2=A0= =C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0, rs.Number_of_orders

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 , rs.Total_custome= r_amount

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 , rs.Average_order

=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 , rs.Standard_deviation<= u>

FROM

<= span style=3D"font-family:"Courier New";color:black">(<= /u>

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 SELECT= cust_id AS Customer_ID,

=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 COUNT(amount_sold) AS Number_of_orders,

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 = SUM(amount_sold) AS Total_customer_amount,

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 AVG(amount_sold) AS Average_= order,

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 stddev_samp(amount_sold) AS Standard_deviation

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 FROM sales<= /span>

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 GROUP BY cust_id

=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 HAVING SUM(amount_sold) > 94000

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 AND AVG(amo= unt_sold) < stddev_samp(amount_sold)

) rs

ORDER BY

=

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 -- Total= _customer_amount DESC

=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 3 DESC

=C2=A0

Hive comes back in 17 minutes with 5,9= 48 rows

=C2=A0

bl -f sales.hql > sales.log

Conne= cting to jdbc:hive2://rhes564:10010/default

Connected to: Apache Hive (version 1.2.1)

Driver: Hive JDBC (version 1.2.1)

= Transaction isolation: TRANSACTION_REPEATABLE_READ

=

Running init script /home/hduser/dba/bin/hive_on_spark_init.hql<= /span>

No rows affected (0.097 seconds)<= u>

No rows affected (0.001 seconds)<= u>

No rows affected (0.001 seconds)<= /p>

No rows affected (0.038 seconds)

INFO=C2=A0 : Warning: Using constant number 3 in order by. If you tr= y to use position alias when hive.groupby.orderby.position.alias is false, = the position alias will be ignored.

INFO= =C2=A0 :

Query Hive on Spark job[0] stag= es:

INFO=C2=A0 : 0<= /p>

INFO=C2=A0 : 1

INFO=C2=A0 = : 2

INFO=C2=A0 :

Status: Running (Hive on Spark job[0])

=

INFO=C2=A0 : Job Progress Format

CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCo= unt-FailedTasksCount)/TotalTasksCount [StageCost]

<= p class=3D"MsoNormal">INFO=C2=A0 : 2015-12-29 09:33:25,815 Stage-0_0: 0/11 Stage-1_0: 0= /1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:33:28,829 Stage-0_0: 0/11 Stage= -1_0: 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:33:31,857 Stage-0_0: 0(+= 2)/11=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 20= 15-12-29 09:33:34,875 Stage-0_0: 0(+2)/11=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0= : 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:33:37,903 Stage-0_0: 0(+2)/1= 1=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-= 12-29 09:33:40,918 Stage-0_0: 0(+2)/11=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 0= /1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:33:43,939 Stage-0_0: 0(+2)/11= =C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-= 29 09:33:46,958 Stage-0_0: 0(+2)/11=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 0/10= 09=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-2_0: 0/1<= /p>

INFO=C2=A0 : 2015-12-29 09:33:49,971 Stage-0_0: 0(+2)/11=C2= =A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29= 09:33:52,991 Stage-0_0: 0(+2)/11=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 0/1009= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:33:56,007 Stage-0_0: 0(+2)/11=C2=A0= =C2=A0=C2=A0=C2=A0 Stage-1_0: 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 St= age-2_0: 0/1

=C2=A0=

INFO=C2=A0 : 2015-12-29 09:50:03,578 Stage-0_0: 10(+1)/11=C2= =A0=C2=A0=C2=A0 Stage-1_0: 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage= -2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:= 06,590 Stage-0_0: 10(+1)/11=C2=A0=C2=A0=C2=A0 Stage-1_0: 0/1009=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 Stage-2_0: 0/1

= INFO=C2=A0 : 2015-12-29 09:50:09,602 Stage-0_0: 10(+1)/11=C2=A0=C2=A0=C2=A0= Stage-1_0: 0/1009=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:10,606 Stage-0_= 0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 0(+2)/1009= =C2=A0=C2=A0 Stage-2_0: 0/1

= INFO=C2=A0 := 2015-12-29 09:50:11,610 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 Stage-1_0: 6(+2)/1009=C2=A0=C2=A0 Stage-2_0: 0/1=

INFO=C2=A0 : 2015-12-29 09:50:12,618 Stage-0_0: 11/11= Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 30(+2)/1009=C2=A0 = Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29= 09:50:13,622 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= Stage-1_0: 59(+2)/1009=C2=A0 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:14,626 Stage-0_0: 11/11 Finished=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 90(+2)/1009=C2=A0 Stage-2_0: 0/1<= /span>

INFO=C2=A0 : 2015-12-29 09:50:15,631 Sta= ge-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 124(+= 2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015= -12-29 09:50:16,654 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 Stage-1_0: 160(+2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:17,659 Stage-0_0: 11/11 Finished=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 193(+2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:18,663 Stage-0_= 0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 228(+2)/10= 09 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-2= 9 09:50:19,667 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 Stage-1_0: 262(+2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:20,672 Stage-0_0: 11/11 Finished=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 298(+2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:21,679 Stage-0_0: = 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 338(+2)/1009 = Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29= 09:50:22,687 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= Stage-1_0: 376(+2)/1009 Stage-2_0: 0/1

= INFO=C2=A0 : 2015-12-29 09:50:23,691 Stage-0_0: 11/11 Finished=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 417(+3)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:24,696 Stage-0_0: 11/= 11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 460(+2)/1009 Sta= ge-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:5= 0:25,699 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stag= e-1_0: 502(+2)/1009 Stage-2_0: 0/1

INFO= =C2=A0 : 2015-12-29 09:50:26,707 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 542(+2)/1009 Stage-2_0: 0/1<= u>

INFO=C2=A0 : 2015-12-29 09:50:27,712 Stage-0_0: 11/11 = Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 584(+2)/1009 Stage-= 2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:2= 8,719 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1= _0: 624(+2)/1009 Stage-2_0: 0/1

INFO=C2= =A0 : 2015-12-29 09:50:29,730 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 Stage-1_0: 667(+2)/1009 Stage-2_0: 0/1<= /u>

INFO=C2=A0 : 2015-12-29 09:50:30,736 Stage-0_0: 11/11 Fin= ished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 709(+3)/1009 Stage-2_0= : 0/1

INFO=C2=A0 : 2015-12-29 09:50:31,7= 40 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0:= 754(+3)/1009 Stage-2_0: 0/1

INFO=C2=A0 = : 2015-12-29 09:50:32,743 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 Stage-1_0: 797(+2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:33,747 Stage-0_0: 11/11 Finished= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 844(+2)/1009 Stage-2_0: 0/1=

INFO=C2=A0 : 2015-12-29 09:50:34,754 St= age-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 888(= +2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 201= 5-12-29 09:50:35,759 Stage-0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0 Stage-1_0: 934(+2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:36,764 Stage-0_0: 11/11 Finished=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 981(+2)/1009 Stage-2_0: 0/1

INFO=C2=A0 : 2015-12-29 09:50:37,768 Stage= -0_0: 11/11 Finished=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 1009/10= 09 Finished=C2=A0=C2=A0 Stage-2_0: 0(+1)/1

INFO=C2=A0 : 2015-12-29 09:50:38,771 Stage-0_0: 11/11 Finished=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 Stage-1_0: 1009/1009 Finished=C2=A0=C2=A0 Stage= -2_0: 1/1 Finished

INFO=C2=A0 : Status: = Finished successfully in 1036.00 seconds

5,948 rows selected (1074.817 seconds)

=C2=A0

So it returns 5948 rows in 17 minute= s. In contrast IQ returns 5947 rows in 23 seconds

<= p class=3D"MsoNormal">=C2=A0

Sybase IQ is a columnar database so= each column is created as a fast projection index by default. In addition = I have created LF (bitmap) indexes on dimension columns (PROD_ID, CUST_ID, = TIME_ID, CHANNEL_ID, PROMO_ID). Now the query only touches CUST_ID.<= u>

=C2=A0

My suspicion is t= hat it is the Standard Deviation function stddev= _samp() that could be the bottleneck?

= =C2=A0

Thanks

=C2= =A0

Mich Talebzadeh

=C2=A0

Sybase ASE 15 Go= ld Medal Award 2008

A Winning Strategy: Running the most= Critical Financial Data on ASE 15

http://login.sybase.com/files/Produ= ct_Overviews/ASE-Winning-Strategy-091908.pdf

Author of the = books "A Practitioner=E2=80=99s Guide to Upgrading to Sybase ASE 15= ", ISBN 978-0-9563693-0-7.

co-author "Sybase = Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4

<= u>Publications due shortly:

<= span style=3D"font-size:10.0pt;font-family:"Arial",sans-serif;col= or:black">Complex Event Processing in Heterogeneous Environments= , ISBN: 978-0-9563693-3-8

Oracle and Sybase, Concepts and Con= trasts, ISBN: 978-0-9563693-1-4, volu= me one out shortly

=C2=A0

http://talebzadehmich.wordpress.com

=C2=A0<= /p>

NOTE: The information in this email is = proprietary and confidential. This message is for the designated recipient = only, if you are not the intended recipient, you should destroy it immediat= ely. Any information in this message shall not be understood as given or en= dorsed by Peridale Technology Ltd, its subsidiaries or their employees, unl= ess expressly so stated. It is the responsibility of the recipient to ensur= e that this email is virus free, therefore neither Peridale Ltd, its subsid= iaries nor their employees accept any responsibility.<= /p>

=C2=A0

=

=C2=A0

=C2=A0

= Latest=C2=A0news=C2=A0at Handy

Handy=C2=A0just raised $50m=C2= =A0led by Fidelity

=C2=A0

3D"Image



Want to work at Han= dy? Check out our=C2=A0culture deck and open r= oles
Latest=C2=A0news=C2= =A0at Handy
Handy=C2=A0just raised $50m=C2=A0led by Fidelity

--001a1142616058ec990528239532-- --001a1142616058ec9d0528239533 Content-Type: image/jpeg; name="image001.jpg" Content-Disposition: inline; filename="image001.jpg" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: 69a999e1f0976cda_0.1 /9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0a HBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIy MjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCABkAGQDASIA AhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQA AAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3 ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWm p6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEA AwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSEx BhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElK U1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3 uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD3+iii gAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKA CiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAK KKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAoo ooAKKKKACiiigAooooAKKKKACiiigD//2Q== --001a1142616058ec9d0528239533--