drill-issues mailing list archives

From "Deneche A. Hakim (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-2408) Invalid (0 length) parquet file created by CTAS
Date Fri, 20 Mar 2015 01:05:38 GMT

     [ https://issues.apache.org/jira/browse/DRILL-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deneche A. Hakim updated DRILL-2408:
------------------------------------
    Attachment: DRILL-2408.1.patch.txt

I updated {{ParquetRecordWriter}} to delete the last file it created if that file is empty (no records
were written to it).
I added two unit tests. The first covers the default case, where we create a table from
a query that returns 0 rows. The second covers the case where {{ParquetRecordWriter}} flushes
its contents just after writing the last record and then creates a new, empty file.
I also updated {{TestUnionAll}} and {{TestExampleQueries}} because I kept getting random failures
caused by these tests trying to create views with the same name.
- Unit tests are passing; still waiting for the results of functional/customer/tpch100.

Please review it [here|https://reviews.apache.org/r/32273/].
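
For readers without the patch at hand, the cleanup idea described above can be sketched roughly as follows. This is not the code from the patch: the class {{EmptyFileCleanupSketch}} and its methods are hypothetical, it writes plain text lines rather than Parquet, and it assumes the writer opens its output file eagerly and counts the records written to it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for a record writer; NOT Drill's ParquetRecordWriter. */
public class EmptyFileCleanupSketch {
    private final Path file;
    private long recordsWritten = 0;

    private EmptyFileCleanupSketch(Path file) {
        this.file = file;
    }

    /** Opens the output file eagerly, before any record is known to exist. */
    public static EmptyFileCleanupSketch open(Path dir, String name) {
        try {
            return new EmptyFileCleanupSketch(Files.createFile(dir.resolve(name)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends one record as a text line and counts it. */
    public void writeRecord(String record) {
        try {
            Files.write(file, (record + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.APPEND);
            recordsWritten++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public Path file() {
        return file;
    }

    /**
     * Closes the writer. If no record was written, the file is deleted so
     * that readers never see a 0 length file. Returns true if it was deleted.
     */
    public boolean close() {
        if (recordsWritten > 0) {
            return false;
        }
        try {
            return Files.deleteIfExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the same check covers both tested cases: a query that returns 0 rows, and a fresh file opened right after a flush that never receives a record.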

> Invalid (0 length) parquet file created by CTAS
> -----------------------------------------------
>
>                 Key: DRILL-2408
>                 URL: https://issues.apache.org/jira/browse/DRILL-2408
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Writer
>    Affects Versions: 0.8.0
>            Reporter: Aman Sinha
>            Assignee: Deneche A. Hakim
>            Priority: Critical
>             Fix For: 0.9.0
>
>         Attachments: DRILL-2408.1.patch.txt
>
>
> We should not be creating 0 length parquet files; subsequent queries on these will fail with the error shown below.
> {code}
> 0: jdbc:drill:zk=local> create table tt5 as select * from cp.`tpch/region.parquet` where 1=0;
> +------------+---------------------------+
> |  Fragment  | Number of records written |
> +------------+---------------------------+
> | 0_0        | 0                         |
> +------------+---------------------------+
> 1 row selected (0.8 seconds)
> 0: jdbc:drill:zk=local> select count(*) from tt5;
> Query failed: RuntimeException: file:/tmp/tt5/0_0_0.parquet is not a Parquet file (too small)
> Error: exception while executing query: Failure while executing query. 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
