hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-625) Use of BinarySortableSerDe for serialization of the value between map and reduce boundary
Date Fri, 10 Jul 2009 18:59:15 GMT

    [ https://issues.apache.org/jira/browse/HIVE-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729772#action_12729772 ]

Zheng Shao commented on HIVE-625:
---------------------------------

For this particular case, I think predicate pushdown will move the filter to the mapper side,
and the column pruner will prune out all columns that are not accessed.
So the reducer will probably read every column that is passed across the map/reduce boundary.

I agree there can still be cases where the opposite holds, but they should not come up often.
I can also make this SerDe configurable if that's a better idea.

What do you think?


> Use of BinarySortableSerDe for serialization of the value between map and reduce boundary
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-625
>                 URL: https://issues.apache.org/jira/browse/HIVE-625
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-625.1.patch
>
>
> We currently use LazySimpleSerDe, which serializes doubles to text format. Until we have
> LazyBinarySerDe, we should switch to BinarySortableSerDe because it is still much faster
> than LazySimpleSerDe.
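
To make the trade-off in the issue description concrete, here is a minimal sketch of the two styles of encoding a double: as decimal text, and as a fixed-width, order-preserving binary value. It illustrates the general technique only and is not Hive's code; the class and method names (DoubleEncodingSketch, encodeText, encodeSortable, decodeSortable) are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not taken from LazySimpleSerDe or BinarySortableSerDe.
public class DoubleEncodingSketch {

    // Text form: the decimal string as UTF-8 bytes. Length varies with the
    // value, and the reader has to parse the string back into a double.
    static byte[] encodeText(double d) {
        return Double.toString(d).getBytes(StandardCharsets.UTF_8);
    }

    // Order-preserving binary form: always 8 bytes, big-endian.
    // The bit flips make unsigned byte-wise comparison agree with numeric order.
    static byte[] encodeSortable(double d) {
        long bits = Double.doubleToLongBits(d);
        if (bits < 0) {
            bits = ~bits;            // negative value: flip every bit
        } else {
            bits ^= Long.MIN_VALUE;  // non-negative value: flip only the sign bit
        }
        return ByteBuffer.allocate(Long.BYTES).putLong(bits).array();
    }

    // Inverse of encodeSortable.
    static double decodeSortable(byte[] encoded) {
        long bits = ByteBuffer.wrap(encoded).getLong();
        if (bits < 0) {
            bits ^= Long.MIN_VALUE;  // top bit set: original was non-negative
        } else {
            bits = ~bits;            // top bit clear: original was negative
        }
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        double v = 3.141592653589793;
        System.out.println("text bytes:    " + encodeText(v).length);
        System.out.println("binary bytes:  " + encodeSortable(v).length);
        System.out.println("round trip ok: " + (decodeSortable(encodeSortable(v)) == v));
        System.out.println("order kept:    "
            + (Arrays.compareUnsigned(encodeSortable(-2.5), encodeSortable(1.0)) < 0));
    }
}
```

The binary form avoids formatting and parsing decimal strings on both sides of the shuffle, which is the kind of cost the description refers to. How large the difference is in practice depends on the data and is not measured here.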

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

