Date: Wed, 8 Jul 2015 17:39:04 +0000 (UTC)
From: "Venki Korukanti (JIRA)"
To: dev@drill.apache.org
Subject: [jira] [Resolved] (DRILL-3271) Hive : Tpch 01.q fails with a verification issue for SF100 dataset

     [ https://issues.apache.org/jira/browse/DRILL-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Venki Korukanti resolved DRILL-3271.
------------------------------------
    Resolution: Invalid

Just had a discussion with [~adeneche]. Floating point differences between runs are due to truncation in arithmetic operations and the order in which data arrives at the aggregator. The differences here still seem to be within an acceptable range. We need to update the margin-of-error constant in the test framework.
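To illustrate the point above, here is a minimal Python sketch, not Drill's aggregator or test framework code. The helper names `running_sum` and `within_margin` and the `rel_tol` values are hypothetical and chosen only for illustration. The two `avg_disc` values are copied from the "A F" row of the Expected and Actual output in the quoted report.

```python
import math

def running_sum(values):
    """Add doubles one at a time, in the order given, like a simple SUM aggregator."""
    total = 0.0
    for v in values:
        total += v
    return total

def within_margin(expected, actual, rel_tol=1e-9):
    """Relative comparison; rel_tol is a hypothetical margin, not Drill's constant."""
    return math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=0.0)

# Same four values in two arrival orders: the first 1.0 is rounded away in order_a.
order_a = running_sum([1e16, 1.0, -1e16, 1.0])
order_b = running_sum([1e16, -1e16, 1.0, 1.0])

# avg_disc for the "A F" row, copied from the report.
expected_avg_disc = 0.05000224353079674
actual_avg_disc = 0.0500022435305286

print(order_a == order_b)                                  # sums differ by arrival order
print(expected_avg_disc == actual_avg_disc)                # exact equality fails
print(within_margin(expected_avg_disc, actual_avg_disc))   # passes with a 1e-9 margin
print(within_margin(expected_avg_disc, actual_avg_disc, rel_tol=1e-12))  # too strict
```

The relative difference between the two `avg_disc` values is roughly 5e-12, so whether the check passes depends entirely on the margin chosen. That is why the fix here is to the constant rather than to the query results.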

> Hive : Tpch 01.q fails with a verification issue for SF100 dataset
> ------------------------------------------------------------------
>
>                 Key: DRILL-3271
>                 URL: https://issues.apache.org/jira/browse/DRILL-3271
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>            Reporter: Rahul Challapalli
>            Assignee: Venki Korukanti
>             Fix For: 1.2.0
>
>         Attachments: tpch100_hive.ddl
>
>
> git.commit.id.abbrev=5f26b8b
> Query :
> {code}
> select
>   l_returnflag,
>   l_linestatus,
>   sum(l_quantity) as sum_qty,
>   sum(l_extendedprice) as sum_base_price,
>   sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
>   sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
>   avg(l_quantity) as avg_qty,
>   avg(l_extendedprice) as avg_price,
>   avg(l_discount) as avg_disc,
>   count(*) as count_order
> from
>   lineitem
> where
>   l_shipdate <= date '1998-12-01' - interval '120' day (3)
> group by
>   l_returnflag,
>   l_linestatus
> order by
>   l_returnflag,
>   l_linestatus;
> {code}
> The 4th column appears to have some differences. Not sure if it is within acceptable range
> Expected :
> {code}
> A F 3.775127758E9 5.660776097194428E12 5.377736398183942E12 5.592847429515948E12 25.499370423275426 38236.11698430475 0.05000224353079674 148047881
> N O 7.269911583E9 1.0901214476134316E13 1.0356163586785008E13 1.077041889123738E13 25.499873337396807 38236.997134222445 0.04999763132401859 285095988
> R F 3.77572497E9 5.661603032745362E12 5.378513563915394E12 5.593662252666902E12 25.50006628406532 38236.69725845312 0.05000130433952159 148067261
> N F 9.8553062E7 1.4777109838597995E11 1.403849659650348E11 1.459997930327757E11 25.501556956882876 38237.19938880449 0.04998528433803118 3864590
> {code}
> Actual :
> {code}
> A F 3.775127758E9 5.660776097194352E12 5.37773639818398E12 5.592847429515874E12 25.499370423275426 38236.11698430423 0.0500022435305286 148047881
> N O 7.269911583E9 1.0901214476134352E13 1.0356163586784926E13 1.0770418891237576E13 25.499873337396807 38236.99713422257 0.04999763132535226 285095988
> R F 3.77572497E9 5.661603032745394E12 5.378513563915313E12 5.593662252666848E12 25.50006628406532 38236.69725845333 0.05000130433925318 148067261
> N F 9.8553062E7 1.4777109838598022E11 1.4038496596503506E11 1.45999793032776E11 25.501556956882876 38237.19938880456 0.049985284338093884 3864590
> {code}
> The data is 100 GB, so I couldn't attach it here.
> I attached the hive ddl. Let me know if you need anything else



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)