hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (HIVE-13873) Column pruning for nested fields
Date Wed, 28 Sep 2016 08:47:20 GMT


ASF GitHub Bot commented on HIVE-13873:

GitHub user winningsix opened a pull request:

    HIVE-13873 Column pruning for nested fields


You can merge this pull request into a Git repository by running:

    $ git pull HIVE-13873

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #105

commit ea462c256f773410c7023dcbfbe365c7cc8200b6
Author: Ferdinand Xu <>
Date:   2016-09-28T01:15:51Z

    HIVE-13873 Column pruning for nested fields


> Column pruning for nested fields
> --------------------------------
>                 Key: HIVE-13873
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Logical Optimizer
>            Reporter: Xuefu Zhang
>            Assignee: Ferdinand Xu
>         Attachments: HIVE-13873.wip.patch
> Some columnar file formats, such as Parquet, also store the fields of a struct column by
> column, using the encoding described in Google's Dremel paper. It is very common in big data
> for data to be stored in structs while queries need only a subset of the fields in those structs.
> However, Hive presently still reads the whole struct regardless of whether all of its fields
> are selected. Therefore, pruning unwanted sub-fields of structs or nested fields at file-reading
> time would be a big performance boost in such scenarios.
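
As a rough illustration of the idea in the description above, here is a minimal Python sketch of
pruning a nested schema down to the sub-fields a query references. It is not the code in the
patch or any Hive API: the function name prune_schema, the dict-based schema and the dotted
path syntax are all assumptions made up for this example.

```python
def prune_schema(schema, paths):
    """Keep only the fields of `schema` named by dotted `paths`.

    `schema` is a hypothetical nested dict: struct fields map to dicts,
    leaf fields map to a type name. This is an illustration only, not
    how Hive or Parquet represent schemas internally.
    """
    # Group the requested sub-paths by their top-level field name.
    wanted = {}
    for path in paths:
        head, _, rest = path.partition(".")
        wanted.setdefault(head, []).append(rest)

    unknown = set(wanted) - set(schema)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")

    pruned = {}
    for name, field in schema.items():
        if name not in wanted:
            continue  # field is not referenced, so a reader could skip it
        sub_paths = wanted[name]
        if "" in sub_paths:
            # The whole field was requested; keep it unchanged.
            pruned[name] = field
        elif isinstance(field, dict):
            # Only some sub-fields were requested; recurse into the struct.
            pruned[name] = prune_schema(field, sub_paths)
        else:
            raise KeyError(f"{name} is not a struct")
    return pruned


# Hypothetical table schema with a nested struct column.
schema = {
    "id": "bigint",
    "user": {
        "name": "string",
        "age": "int",
        "address": {"city": "string", "zip": "string"},
    },
}

# A query touching only user.name and user.address.city.
needed = prune_schema(schema, ["user.name", "user.address.city"])
print(needed)
```

With a pruned schema like this, a columnar reader would only need to open the column chunks for
user.name and user.address.city rather than every leaf under user.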

This message was sent by Atlassian JIRA
