hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14815) Support vectorization for Parquet
Date Thu, 22 Sep 2016 07:22:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15512442#comment-15512442 ]

ASF GitHub Bot commented on HIVE-14815:
---------------------------------------

GitHub user winningsix opened a pull request:

    https://github.com/apache/hive/pull/104

    HIVE-14815: Support vectorization for Parquet

    This patch includes the following changes:
    1. Implement a vectorized page reader that supports dictionary and RLE encoding.
    2. Enable vectorization for the Parquet input format.
    3. Support several data types.
    This is a WIP jira.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/winningsix/hive vectorization_parquet

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #104
    
----
commit a38c766e09bc1c3728fa413767b9fbaa19a4b005
Author: Ferdinand Xu <cheng.a.xu@intel.com>
Date:   2016-09-01T22:15:31Z

    HIVE-14815: Support vectorization for Parquet

----


> Support vectorization for Parquet
> ---------------------------------
>
>                 Key: HIVE-14815
>                 URL: https://issues.apache.org/jira/browse/HIVE-14815
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
>
> Parquet doesn't provide a vectorized reader that Hive can use directly. Also, a
> Decimal column batch consists of HiveDecimal values, a Hive type that is unknown to
> Parquet. To support the Hive vectorized execution engine, we have to implement the
> vectorized Parquet reader on the Hive side. To limit the performance impact, we need
> to implement a page-level vectorized reader.
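
As a rough illustration of what decoding a dictionary-encoded, run-length-encoded page into a column batch involves, here is a minimal Java sketch. It is not the code from the pull request and does not use Parquet's or Hive's reader classes: the Run type, the decodeBatch method, and the plain long[] standing in for a column batch are all hypothetical names chosen for this example, and real Parquet pages also use bit-packed runs, which are left out here.

```java
import java.util.Arrays;

// Illustrative sketch only; every name here is hypothetical, not a Hive or Parquet API.
final class DictionaryRleBatchSketch {

    /** One RLE run: the same dictionary id repeated count times. */
    public static final class Run {
        final int dictionaryId;
        final int count;

        public Run(int dictionaryId, int count) {
            this.dictionaryId = dictionaryId;
            this.count = count;
        }
    }

    /**
     * Decodes runs of dictionary ids into the given batch array, one whole run
     * (or the part of it that still fits) at a time instead of one value at a time.
     * Returns the number of slots filled. An invalid dictionary id fails with
     * ArrayIndexOutOfBoundsException.
     */
    public static int decodeBatch(Run[] runs, long[] dictionary, long[] vector) {
        int filled = 0;
        for (Run run : runs) {
            if (filled == vector.length) {
                break; // batch is full; remaining runs belong to the next batch
            }
            int n = Math.min(run.count, vector.length - filled);
            // Look up the dictionary once per run, then bulk-fill the batch slots.
            Arrays.fill(vector, filled, filled + n, dictionary[run.dictionaryId]);
            filled += n;
        }
        return filled;
    }

    public static void main(String[] args) {
        long[] dictionary = {10L, 20L};
        Run[] runs = {new Run(1, 3), new Run(0, 4)};
        long[] vector = new long[5]; // batch of five values
        int filled = decodeBatch(runs, dictionary, vector);
        // prints: 5 [20, 20, 20, 10, 10]
        System.out.println(filled + " " + Arrays.toString(vector));
    }
}
```

The point of the sketch is the per-run dictionary lookup and bulk fill: a repeated value costs one lookup for the whole run, which is the kind of saving a page-level vectorized reader is after. Carrying the unread part of a truncated run over to the next batch is omitted.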



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
