hadoop-hive-dev mailing list archives

From "Yuntao Jia (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-640) Add LazyBinarySerDe to Hive
Date Fri, 31 Jul 2009 18:18:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuntao Jia updated HIVE-640:
----------------------------

    Attachment: HIVE-640.3.patch

A newer patch, which incorporates most of Zheng's comments:

1/ Removed functions in TestLazyBinarySerDe that duplicated ones already in TestBinarySortableSerDe.
Those functions are now public static in TestBinarySortableSerDe, so TestLazyBinarySerDe
shares them.
2/ Fixed the copy constructors in all LazyBinaryPrimitive classes so that they copy the
value of the data as well.
3/ Cached ((ListObjectInspector)oi).getListElementObjectInspector() in LazyBinaryArray to
a local variable.
4/ Added a private function in LazyBinarySerDe so that serializing a row and serializing a
struct call the same function. Previously they used separate but very similar code.
5/ Simplified parsing code in LazyBinaryArray/LazyBinaryStruct/LazyBinaryMap. In particular,
if valueIsNull[i] in LazyBinaryArray is true, then we don't need to update valueStart[i],
valueLength[i] and lastElementEnd. Similar changes were made in LazyBinaryStruct and LazyBinaryMap.
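
To illustrate point 5, here is a minimal sketch of the idea, not the code in the patch.
The class name LazyArraySketch and the byte layout (a count byte, a null bitmap, then
a length byte plus payload per non-null element) are assumptions made up for this example
and are not the format LazyBinaryArray actually uses. Only the names valueIsNull,
valueStart, valueLength and lastElementEnd come from the description above.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified stand-in for the array parsing described in point 5. */
public class LazyArraySketch {
    // Assumed layout (NOT Hive's actual wire format):
    //   byte 0          : element count n
    //   next ceil(n/8)  : null bitmap, bit i set means element i is non-null
    //   then, for each non-null element: 1 length byte followed by the payload
    private final byte[] bytes;
    private final int count;
    private final boolean[] valueIsNull;
    private final int[] valueStart;
    private final int[] valueLength;

    public LazyArraySketch(byte[] bytes) {
        this.bytes = bytes;
        this.count = bytes[0] & 0xFF;
        this.valueIsNull = new boolean[count];
        this.valueStart = new int[count];
        this.valueLength = new int[count];
        parse();
    }

    private void parse() {
        int bitmapStart = 1;
        // The first element begins right after the null bitmap.
        int lastElementEnd = bitmapStart + (count + 7) / 8;
        for (int i = 0; i < count; i++) {
            int bits = bytes[bitmapStart + i / 8] & 0xFF;
            valueIsNull[i] = (bits & (1 << (i % 8))) == 0;
            if (valueIsNull[i]) {
                // A null element occupies no bytes, so valueStart[i],
                // valueLength[i] and lastElementEnd are left untouched.
                continue;
            }
            valueLength[i] = bytes[lastElementEnd] & 0xFF;
            valueStart[i] = lastElementEnd + 1;
            lastElementEnd = valueStart[i] + valueLength[i];
        }
    }

    /** Decodes element i on demand; returns null for a null element. */
    public String get(int i) {
        if (valueIsNull[i]) {
            return null;
        }
        return new String(bytes, valueStart[i], valueLength[i], StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three elements: "hi", null, "x" (bitmap 0b101 marks 0 and 2 as non-null).
        byte[] data = {3, 0b101, 2, 'h', 'i', 1, 'x'};
        LazyArraySketch a = new LazyArraySketch(data);
        System.out.println(a.get(0) + " " + a.get(1) + " " + a.get(2)); // prints: hi null x
    }
}
```

Since readers check valueIsNull[i] before looking at the offsets, whatever is left in
valueStart[i] and valueLength[i] for a null element is never read, which is why the
updates can be skipped.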

Thanks Zheng for the reviews. 


> Add LazyBinarySerDe to Hive
> ---------------------------
>
>                 Key: HIVE-640
>                 URL: https://issues.apache.org/jira/browse/HIVE-640
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Zheng Shao
>            Assignee: Yuntao Jia
>         Attachments: HIVE-640.1.patch, HIVE-640.2.patch, HIVE-640.3.patch
>
>
> LazyBinarySerDe will serialize the data in binary format while supporting LazyDeserialization.
> This will be used as the SerDe for value between map and reduce, and also between different
> map-reduce jobs.
> This will help improve the performance of Hive a lot.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

