hive-issues mailing list archives

From "Teddy Choi (JIRA)" <>
Subject [jira] [Assigned] (HIVE-16704) Replace vector code generation with stream and lambdas
Date Thu, 18 May 2017 04:39:04 GMT


Teddy Choi reassigned HIVE-16704:

> Replace vector code generation with stream and lambdas
> ------------------------------------------------------
>                 Key: HIVE-16704
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
> Hive uses a vectorized execution engine, which relies on a code generator to cover the various
data types. The generator exists because the Java compiler recognizes and optimizes only simple
loops over primitive data types and operators, not loops that contain conditional branches such
as IF or SWITCH. The code generator and its generated code are hard to read and maintain.
> Meanwhile, Hive 3 uses Java 8+, which introduced lambdas and the new Stream API.
> Lambdas combined with the new Stream API are an excellent replacement for vector code generation.
The result is more concise, because it doesn't make a separate copy of the template code for each
class. It's more precise, because the template code and string replacement allowed only some data
types and operators to be substituted, not whole code blocks with compiler support. It's still
fast, because the Java 8 compiler optimizes primitive data type operations in a lambda as a loop.
Therefore, it will reduce memory usage and give programmers more readable and extensible code.
> The vector code generation part is huge, so it needs to be divided into small sub-tasks.
I will start with ColumnArithmeticColumn for long, which covers LongColAddLongColumn, LongColSubtractLongColumn,
and LongColMultiplyLongColumn.
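
The idea described above can be illustrated with a minimal sketch in which one routine, parameterized
by a lambda, stands in for separately generated add, subtract, and multiply classes. This is not
Hive's actual code: the class LongColumnArithmeticSketch and the method apply are hypothetical names,
and plain long[] arrays are assumed in place of Hive's column vector types.

```java
import java.util.Arrays;
import java.util.function.LongBinaryOperator;
import java.util.stream.IntStream;

// Hypothetical sketch, not Hive's API: one lambda-parameterized routine
// instead of one generated class per arithmetic operator.
final class LongColumnArithmeticSketch {

    // Applies op element-wise to the first n entries of left and right,
    // writing each result into out. Plain arrays are an assumption here.
    static void apply(long[] left, long[] right, long[] out, int n, LongBinaryOperator op) {
        IntStream.range(0, n).forEach(i -> out[i] = op.applyAsLong(left[i], right[i]));
    }

    public static void main(String[] args) {
        long[] a = {1L, 2L, 3L};
        long[] b = {10L, 20L, 30L};
        long[] out = new long[a.length];

        // Each operator is a one-line lambda rather than a copy of a template.
        apply(a, b, out, a.length, (x, y) -> x + y);
        System.out.println("add:      " + Arrays.toString(out));

        apply(a, b, out, a.length, (x, y) -> x - y);
        System.out.println("subtract: " + Arrays.toString(out));

        apply(a, b, out, a.length, (x, y) -> x * y);
        System.out.println("multiply: " + Arrays.toString(out));
    }
}
```

Whether a stream-based loop like this matches the speed of the generated code would need to be
measured; the sketch only shows how the three operators could share a single implementation.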

This message was sent by Atlassian JIRA
