hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-270) Add a lazy-deserialized SerDe for space and cpu efficient serialization of rows with primitive types
Date Wed, 11 Feb 2009 20:42:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672764#action_12672764 ]

Joydeep Sen Sarma commented on HIVE-270:
----------------------------------------

Looks pretty good!

For the LazyString conversion, we can use the static function Text.decode. Text.append
does an unnecessary byte copy into the Text's internal byte array.
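
To illustrate the copy being avoided, here is a minimal sketch that uses only the JDK, not
Hadoop. The class and method names (LazyStringDecodeSketch, decodeWithCopy, decodeInPlace)
are hypothetical and are not taken from the patch; decodeInPlace only approximates what a
call like Text.decode(bytes, start, length) would do on a slice of the row buffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LazyStringDecodeSketch {

    // Copying variant: the field's bytes are first copied into a private array
    // (similar in spirit to appending them to a Text) and only then decoded.
    static String decodeWithCopy(byte[] row, int start, int length) {
        byte[] copy = Arrays.copyOfRange(row, start, start + length); // extra copy
        return new String(copy, StandardCharsets.UTF_8);
    }

    // Direct variant: wrap the slice of the row buffer and decode it in place,
    // without an intermediate byte array.
    static String decodeInPlace(byte[] row, int start, int length) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(row, start, length))
                    .toString();
        } catch (CharacterCodingException e) {
            // Assumption for this sketch: surface malformed UTF-8 as an unchecked error.
            throw new IllegalArgumentException("field is not valid UTF-8", e);
        }
    }

    public static void main(String[] args) {
        byte[] row = "42\thello\t3.14".getBytes(StandardCharsets.UTF_8);
        // The second field starts at offset 3 and is 5 bytes long.
        System.out.println(decodeInPlace(row, 3, 5));
    }
}
```

Both variants return the same string; the direct one simply skips the intermediate array.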

LazySimpleStructObjectInspector.java: public List<Object> getStructFieldsDataAsList(Object data) {
This is probably not used anywhere, but if data == null it seems we should just return null
(based on how the other object inspectors behave).
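
A rough sketch of the suggested null guard follows. RowSketch is a hypothetical stand-in for
the lazily parsed struct, and the method body is an assumption for illustration, not the
code in the patch.

```java
import java.util.ArrayList;
import java.util.List;

final class StructInspectorNullSketch {

    // Hypothetical stand-in for a lazily parsed row; not a Hive class.
    static final class RowSketch {
        private final List<Object> fields;

        RowSketch(List<Object> fields) {
            this.fields = fields;
        }

        List<Object> getFields() {
            return fields;
        }
    }

    // Return null for a null struct instead of failing on the cast or the field access.
    static List<Object> getStructFieldsDataAsList(Object data) {
        if (data == null) {
            return null;
        }
        return new ArrayList<>(((RowSketch) data).getFields());
    }

    public static void main(String[] args) {
        List<Object> fields = new ArrayList<>();
        fields.add(1);
        fields.add("a");
        System.out.println(getStructFieldsDataAsList(new RowSketch(fields)));
        System.out.println(getStructFieldsDataAsList(null));
    }
}
```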

parse(): in the comparison against the null sequence, the compare method is somewhat generic.
Since we only care about equality, we can compare lengths first to detect a mismatch (should
be a little faster) and do the full comparison only if the lengths are equal.
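
The length-first check could look roughly like this. The names (NullSequenceCheckSketch,
isNullSequence) are hypothetical, and the "\N" null sequence used in main is an assumption
chosen for the example.

```java
import java.nio.charset.StandardCharsets;

final class NullSequenceCheckSketch {

    // Equality-only check: reject on length mismatch before touching any bytes,
    // and compare byte by byte only when the lengths agree.
    static boolean isNullSequence(byte[] data, int start, int length, byte[] nullSequence) {
        if (length != nullSequence.length) {
            return false; // cheap early exit, no byte comparison needed
        }
        for (int i = 0; i < length; i++) {
            if (data[start + i] != nullSequence[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] nullSequence = "\\N".getBytes(StandardCharsets.UTF_8);
        byte[] row = "a\t\\N\tb".getBytes(StandardCharsets.UTF_8);
        // The second field occupies bytes 2..3 of the row.
        System.out.println(isNullSequence(row, 2, 2, nullSequence));
        // The first field is one byte long, so it is rejected by length alone.
        System.out.println(isNullSequence(row, 0, 1, nullSequence));
    }
}
```

Most fields differ in length from the null sequence, so the early exit handles the common
case without reading the field's bytes at all.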

It may be possible to speed up serialize considerably as well (go directly from primitive
types to bytes and append to a byte buffer), but it would make sense to punt on that.
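
As a sketch of what "directly from primitive types to bytes" might mean for an int, the
following writes the decimal digits straight into an output buffer without building an
intermediate String. The class and method names are hypothetical, and the choice of
ByteArrayOutputStream as the buffer is an assumption for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class DirectIntSerializeSketch {

    // Append the decimal text form of value to out, with no String allocation.
    static void writeInt(ByteArrayOutputStream out, int value) {
        if (value == 0) {
            out.write('0');
            return;
        }
        byte[] scratch = new byte[11]; // "-2147483648" needs 11 bytes
        int pos = scratch.length;
        boolean negative = value < 0;
        // Work on the negative side so that Integer.MIN_VALUE does not overflow.
        int n = negative ? value : -value;
        while (n != 0) {
            scratch[--pos] = (byte) ('0' - (n % 10)); // n % 10 is in -9..0
            n /= 10;
        }
        if (negative) {
            scratch[--pos] = '-';
        }
        out.write(scratch, pos, scratch.length - pos);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeInt(out, 42);
        out.write('\t');
        writeInt(out, -7);
        System.out.println(new String(out.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```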

> Add a lazy-deserialized SerDe for space and cpu efficient serialization of rows with primitive types
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-270
>                 URL: https://issues.apache.org/jira/browse/HIVE-270
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Serializers/Deserializers
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-270.1.patch, HIVE-270.3.patch
>
>
> We want to add a lazy-deserialized SerDe for space and cpu efficient serialization of
> rows with primitive types.
> This SerDe will share the same format as MetadataTypedColumnsetSerDe/TCTLSeparatedProtocol
> to be backward compatible.
> This SerDe will be used to replace the default table SerDe, and the SerDe used to communicate
> with user scripts.
> For simplicity, we don't plan to support nested structure with this SerDe.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

