drill-issues mailing list archives

From "Chris Westin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-1858) Parquet reader should only explicitly fill in data for a column requested but not in the file if there are no valid columns found
Date Fri, 20 Feb 2015 19:58:11 GMT

     [ https://issues.apache.org/jira/browse/DRILL-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-1858:
--------------------------------
    Component/s: Storage - Parquet

> Parquet reader should only explicitly fill in data for a column requested but not in the file if there are no valid columns found
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-1858
>                 URL: https://issues.apache.org/jira/browse/DRILL-1858
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>            Reporter: Jason Altekruse
>            Assignee: Deneche A. Hakim
>             Fix For: Future
>
>
> If columns are requested from a Parquet file that do not appear in that particular file
> (users may have a directory full of files that share some columns but not others), then we
> do not need to create a vector to represent these columns in most cases. These columns can
> be materialized (as a vector filled with nulls) later, when they are referenced in other parts
> of the query, such as a filter or join condition. The current behavior of the reader is to
> always fill vectors for these columns, but this just creates extra payload to ship
> around until the vectors are actually referenced.
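
The proposed behavior can be illustrated with a minimal sketch. This is not Drill code: the `LazyBatch` class, its `vectors` dictionary, and the `get` method are hypothetical names invented for illustration, and plain Python lists stand in for value vectors. It assumes the reader skips requested columns that are missing from the file, falls back to explicit null-filling only when none of the requested columns exist (as the issue title describes), and materializes a null vector on first reference.

```python
from typing import Dict, List, Optional

Vector = List[Optional[int]]


class LazyBatch:
    """Toy record batch (hypothetical; not a Drill class)."""

    def __init__(self, file_columns: Dict[str, Vector],
                 requested: List[str], row_count: int) -> None:
        self.row_count = row_count
        self.requested = list(requested)
        # Create vectors only for requested columns that exist in the file.
        self.vectors: Dict[str, Vector] = {
            name: list(file_columns[name])
            for name in requested
            if name in file_columns
        }
        # Fallback from the issue title: if no valid columns were found,
        # fill the missing columns explicitly with nulls.
        if not self.vectors:
            for name in requested:
                self.vectors[name] = [None] * row_count

    def get(self, name: str) -> Vector:
        """Return a column, materializing an all-null vector on first use."""
        if name not in self.vectors:
            if name not in self.requested:
                raise KeyError(name)
            self.vectors[name] = [None] * self.row_count
        return self.vectors[name]


# Column "b" is requested but absent from this file.
batch = LazyBatch({"a": [1, 2, 3]}, ["a", "b"], row_count=3)
print(sorted(batch.vectors))  # only "a" is carried in the batch so far
print(batch.get("b"))         # "b" is filled with nulls when referenced
```

In this sketch the null vector for `b` is not carried along until something like a filter or join condition asks for it, which is the saving the issue describes.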



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
