drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Venki Korukanti (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (DRILL-3208) Hive : Tpch (SF 0.01) query 10 fails with a system error when the data is backed by hive tables
Date Fri, 29 May 2015 21:56:17 GMT

     [ https://issues.apache.org/jira/browse/DRILL-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Venki Korukanti resolved DRILL-3208.
------------------------------------
    Resolution: Invalid

> Hive : Tpch (SF 0.01) query 10 fails with a system error when the data is backed by hive
tables
> -----------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3208
>                 URL: https://issues.apache.org/jira/browse/DRILL-3208
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>            Reporter: Rahul Challapalli
>            Assignee: Venki Korukanti
>         Attachments: customer.parquet, error.log, lineitem_nodate.parquet, nation.parquet,
orders_nodate.parquet, tpch.ddl
>
>
> git.commit.id.abbrev=6f54223
> I created hive tables on top of tpch parquet data. (Attached the hive ddl script). Since
hive does not support date in parquet serde, I regenerated the parquet files for orders and
lineitem to use string for the date fields. Remaining files do not have a date column.
> When I executed query 10 in the tpch suite, it failed with a system error.
> {code}
> 0: jdbc:drill:schema=dfs_eea> use hive.tpch01_parquet_nodate;
> +-------+---------------------------------------------------------+
> |  ok   |                         summary                         |
> +-------+---------------------------------------------------------+
> | true  | Default schema changed to [hive.tpch01_parquet_nodate]  |
> +-------+---------------------------------------------------------+
> 1 row selected (0.091 seconds)
> 0: jdbc:drill:schema=dfs_eea>
> select
>   c.c_custkey,
>   c.c_name,
>   sum(l.l_extendedprice * (1 - l.l_discount)) as revenue,
>   c.c_acctbal,
>   n.n_name,
>   c.c_address,
>   c.c_phone,
>   c.c_comment
> from
>   customer c,
>   orders o,
>   lineitem l,
>   nation n
> where
>   c.c_custkey = o.o_custkey
>   and l.l_orderkey = o.o_orderkey
>   and cast(o.o_orderdate as date) >= date '1994-03-01'
>   and cast(o.o_orderdate as date) < date '1994-03-01' + interval '3' month
>   and l.l_returnflag = 'R'
>   and c.c_nationkey = n.n_nationkey
> group by
>   c.c_custkey,
>   c.c_name,
>   c.c_acctbal,
>   c.c_phone,
>   n.n_name,
>   c.c_address,
>   c.c_comment
> order by
>   revenue desc
> limit 20;
> Error: SYSTEM ERROR: 
> Fragment 0:0
> [Error Id: 1d327ae0-1cf2-4776-acd3-8eef6cca4b6a on qa-node191.qa.lab:31010] (state=,code=0)
> {code}
> I tried running the above query using dfs instead of hive and it worked as expected.
> I attached the newly generated parquet files and the hive ddl for creating hive tables.
Let me know if you need anything



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message