hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call
Date Tue, 30 Jun 2015 21:35:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609116#comment-14609116 ]

Sergey Shelukhin edited comment on HIVE-10940 at 6/30/15 9:34 PM:
------------------------------------------------------------------

{noformat}
// lets take a look at the operator memory requirements.
{noformat}
This comment looks like it was copy-pasted.

Can you add a comment at the place where the new optimizer is added, indicating that it should
stay at the end (for people who will be adding more optimizers)?

serializedFilterObject is never set anymore. Set or remove?


was (Author: sershe):
{noformat}
// lets take a look at the operator memory requirements.
{noformat}
this comment seems like it was c/p-ed.

Can you add comment to where the new optimizer is added indicating that it should run last?

serializedFilterObject is never set anymore. Set or remove?

> HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-10940
>                 URL: https://issues.apache.org/jira/browse/HIVE-10940
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 1.2.0
>            Reporter: Gopal V
>            Assignee: Sergey Shelukhin
>             Fix For: 2.0.0
>
>         Attachments: HIVE-10940.01.patch, HIVE-10940.02.patch, HIVE-10940.patch
>
>
> {code}
>     String filterText = filterExpr.getExprString();
>     String filterExprSerialized = Utilities.serializeExpression(filterExpr);
> {code}
> serializeExpression initializes Kryo and produces a new packed object for every split.
> The call path is HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters.
> Kryo is very slow at this for a large filter clause.
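
The cost described above comes from repeating the same expensive serialization once per split. One general way to avoid that is to memoize the serialized form, sketched below. This only illustrates the idea and is not necessarily what the attached patches do. SerializedFilterCache and its serializer function are hypothetical names, not Hive APIs, and keying the cache on the filter's expression string is an assumption made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Hive): memoizes an expensive
 * serializer so that repeated calls for the same filter reuse the result.
 */
final class SerializedFilterCache {
    // Assumption: the filter's expression string uniquely identifies it.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> serializer;

    SerializedFilterCache(Function<String, String> serializer) {
        this.serializer = serializer;
    }

    /** Runs the serializer at most once per distinct filter text. */
    String get(String filterText) {
        return cache.computeIfAbsent(filterText, serializer);
    }
}

final class SerializedFilterCacheDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in for an expensive call such as Kryo-based serialization.
        SerializedFilterCache cache = new SerializedFilterCache(text -> {
            calls.incrementAndGet();
            return "serialized:" + text;
        });

        // Simulate three record readers (splits) sharing one filter.
        for (int split = 0; split < 3; split++) {
            cache.get("(key > 10)");
        }
        System.out.println("serializer invocations: " + calls.get());
    }
}
```

In this sketch the three simulated splits trigger a single serializer invocation, because ConcurrentHashMap.computeIfAbsent applies the mapping function only when the key is absent.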



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
