hive-dev mailing list archives

From "Jitendra Nath Pandey (JIRA)" <>
Subject [jira] [Commented] (HIVE-4160) Vectorized Query Execution in Hive
Date Wed, 13 Mar 2013 19:58:14 GMT


Jitendra Nath Pandey commented on HIVE-4160:

    This will be incremental work carried out in multiple phases, with no regressions in the
current system. We will publish a design/scope document very soon.
    The main idea behind the proposal is to transform the execution engine so that it processes
a batch of rows at a time instead of a single row. A row batch will consist of column vectors,
and each operator will process a whole column vector at a time. A column vector will consist
of array(s) of primitive types wherever possible.
    Expressions will be implemented for the various data types using pre-compiled templates.
The appropriate expressions will be added to the operators based on the data types involved.
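
    As a rough illustration of the idea (not the proposed implementation), a row batch of
primitive-array columns and one type-specific expression might look like the sketch below. All
class and method names here (LongColumn, RowBatch, BatchExpression, AddLongColumns) are
hypothetical, and AddLongColumns stands in for what a template could expand to for long + long.

```java
// Hypothetical names throughout; this is an illustrative sketch, not Hive's API.

/** A column of long values stored as a primitive array. */
final class LongColumn {
    final long[] values;
    LongColumn(int capacity) { values = new long[capacity]; }
}

/** A batch of rows laid out column by column. */
final class RowBatch {
    final LongColumn[] columns;
    int size; // number of valid rows currently in the batch

    RowBatch(int numColumns, int capacity) {
        columns = new LongColumn[numColumns];
        for (int i = 0; i < numColumns; i++) {
            columns[i] = new LongColumn(capacity);
        }
    }
}

/** An expression that is evaluated once per batch rather than once per row. */
interface BatchExpression {
    void evaluate(RowBatch batch);
}

/** One possible expansion of a per-type template: long column + long column. */
final class AddLongColumns implements BatchExpression {
    private final int left, right, output;

    AddLongColumns(int left, int right, int output) {
        this.left = left;
        this.right = right;
        this.output = output;
    }

    @Override
    public void evaluate(RowBatch batch) {
        long[] a = batch.columns[left].values;
        long[] b = batch.columns[right].values;
        long[] out = batch.columns[output].values;
        // Tight loop over primitive arrays: no per-row virtual calls or type checks.
        for (int i = 0; i < batch.size; i++) {
            out[i] = a[i] + b[i];
        }
    }
}

final class BatchExpressionDemo {
    /** Fills two input columns, adds them into a third, and returns the result. */
    static long[] run() {
        RowBatch batch = new RowBatch(3, 4);
        for (int i = 0; i < 4; i++) {
            batch.columns[0].values[i] = i;
            batch.columns[1].values[i] = 10L * i;
        }
        batch.size = 4;
        new AddLongColumns(0, 1, 2).evaluate(batch);
        return batch.columns[2].values;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run())); // [0, 11, 22, 33]
    }
}
```

    The data type is resolved once, when the expression is chosen for the operator, so the
inner loop touches only primitive arrays.
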
    The file formats will implement a vectorized iterator interface that provides vectorized
input to the operator tree.
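
    A minimal sketch of what such an iterator could look like, assuming a single long column
for brevity. The names LongBatch, BatchReader, and ArrayBatchReader are hypothetical, and the
in-memory array only stands in for a real file format reader.

```java
// Hypothetical names throughout; this is an illustrative sketch, not Hive's API.

/** A reusable batch holding one column of long values. */
final class LongBatch {
    final long[] values;
    int size; // number of valid entries after the last fill

    LongBatch(int capacity) { values = new long[capacity]; }
}

/** Interface a file format could implement to hand out rows a batch at a time. */
interface BatchReader {
    /** Fills the batch with up to its capacity; returns false once no rows remain. */
    boolean nextBatch(LongBatch batch);
}

/** Stand-in for a file format reader, backed by an in-memory array. */
final class ArrayBatchReader implements BatchReader {
    private final long[] data;
    private int position;

    ArrayBatchReader(long[] data) { this.data = data; }

    @Override
    public boolean nextBatch(LongBatch batch) {
        int n = Math.min(batch.values.length, data.length - position);
        if (n <= 0) {
            batch.size = 0;
            return false;
        }
        System.arraycopy(data, position, batch.values, 0, n);
        position += n;
        batch.size = n;
        return true;
    }
}

final class BatchReaderDemo {
    /** Sums every value the reader produces, consuming one batch at a time. */
    static long sum(BatchReader reader, int batchCapacity) {
        LongBatch batch = new LongBatch(batchCapacity); // reused across calls
        long total = 0;
        while (reader.nextBatch(batch)) {
            for (int i = 0; i < batch.size; i++) {
                total += batch.values[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long[] data = {1, 2, 3, 4, 5};
        System.out.println(sum(new ArrayBatchReader(data), 2)); // 15
    }
}
```

    Reusing one batch object across calls avoids allocating per row; the last batch may be
only partially filled, which is why consumers loop to batch.size rather than the array length.
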

> Vectorized Query Execution in Hive
> ----------------------------------
>                 Key: HIVE-4160
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>   The Hive query execution engine currently processes one row at a time. A single row of
data goes through all the operators before the next row can be processed. This mode of processing
is very inefficient in terms of CPU usage. Research has demonstrated that it yields very
low instructions per cycle [MonetDB]. Hive also relies heavily on lazy deserialization, and
data columns go through a layer of object inspectors that identify the column type, deserialize
the data, and determine the appropriate expression routines in the inner loop. These layers of
virtual method calls further slow down the processing.
> Reference:
