drill-issues mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (DRILL-1562) Parquet Writer hangs when converting TPCH text data (SF100)
Date Mon, 17 Nov 2014 01:17:35 GMT

     [ https://issues.apache.org/jira/browse/DRILL-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau resolved DRILL-1562.
-----------------------------------
    Resolution: Fixed

> Parquet Writer hangs when converting TPCH text data (SF100)
> -----------------------------------------------------------
>
>                 Key: DRILL-1562
>                 URL: https://issues.apache.org/jira/browse/DRILL-1562
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>            Reporter: Abhishek Girish
>            Assignee: Parth Chandra
>             Fix For: 0.7.0
>
>         Attachments: hang.log
>
>
> Converting TPCH text data into Parquet hangs. 
> Table name: lineitem
> Table size: ~80GB
> Input format: psv ('|' separated)
> Number of drillbits: 4
> DRILL_MAX_DIRECT_MEMORY="64G"
> DRILL_MAX_HEAP="32G"
> Query:
> create table lineitem as select
>     cast(columns[0] as int) l_orderkey,
>     cast(columns[1] as int) l_partkey,
>     cast(columns[2] as int) l_suppkey,
>     cast(columns[3] as int) l_linenumber,
>     cast(columns[4] as double) l_quantity,
>     cast(columns[5] as double) l_extendedprice,
>     cast(columns[6] as double) l_discount,
>     cast(columns[7] as double) l_tax,
>     cast(columns[8] as char(1)) l_returnflag,
>     cast(columns[9] as char(1)) l_linestatus,
>     cast(columns[10] as date) l_shipdate,
>     cast(columns[11] as date) l_commitdate,
>     cast(columns[12] as date) l_receiptdate,
>     cast(columns[13] as char(25)) l_shipinstruct,
>     cast(columns[14] as char(10)) l_shipmode,
>     cast(columns[15] as varchar(200)) l_comment
> from dfs.`/tpch-text/scale100/lineitem` lineitem;
> +------------+---------------------------+
> |  Fragment  | Number of records written |
> +------------+---------------------------+
> | 1_58       | 4072947                   |
> | 1_90       | 4088667                   |
> | 1_38       | 4072639                   |
> ...
> ...
> | 1_14       | 6109440                   |
> <hangs>
> ...
> The drillbit endpoint gets set to null, and the point at which the query hangs varies from run to run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
