hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-553) Add BinarySortableSerDe to Hive
Date Thu, 09 Jul 2009 17:39:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729346#action_12729346 ]

Namit Jain commented on HIVE-553:

Can you add some named constants, e.g. 1/2 for boolean false/true? They may help with
debugging later on.
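
A minimal sketch of what such constants could look like, assuming the boolean is written
as a single marker byte. Only the values 1 (false) and 2 (true) come from the comment
above; the class name, the constant names and the encode/decode helpers are hypothetical
and are not taken from the patch.

```java
// Hypothetical helper; only the byte values 1 and 2 come from the review comment.
final class BooleanMarkers {
    static final byte FALSE_BYTE = 1; // marker written for boolean false
    static final byte TRUE_BYTE = 2;  // marker written for boolean true

    private BooleanMarkers() {}

    // Maps a boolean to its one-byte marker.
    static byte encode(boolean value) {
        return value ? TRUE_BYTE : FALSE_BYTE;
    }

    // Maps a marker byte back to a boolean and rejects anything else,
    // so a corrupted buffer fails with a readable message.
    static boolean decode(byte b) {
        switch (b) {
            case FALSE_BYTE:
                return false;
            case TRUE_BYTE:
                return true;
            default:
                throw new IllegalArgumentException("Unexpected boolean marker byte: " + b);
        }
    }
}
```

With named constants, a stray byte in a dump can be matched against a name instead of a
bare literal scattered through the serializer and deserializer.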

Also, it might be useful to add a dump() method that takes the input byte buffer and dumps
its structure.
This can be done in a follow-up; it may be useful for other SerDes as well.
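
As a rough illustration of the dump() idea, the sketch below prints the unread bytes of a
buffer as hex. It is an assumption-laden stand-in: it uses java.nio.ByteBuffer rather than
any Hive buffer class, the class and method names are hypothetical, and it shows raw bytes
only. Dumping the actual structure would additionally need the row's type information.

```java
import java.nio.ByteBuffer;

// Hypothetical debugging helper; java.nio.ByteBuffer stands in for the SerDe's input buffer.
final class ByteBufferDumper {
    private ByteBufferDumper() {}

    // Returns the remaining bytes as space-separated two-digit hex values.
    // Works on a duplicate, so the caller's position and limit are left untouched.
    static String dump(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        StringBuilder sb = new StringBuilder();
        while (view.hasRemaining()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", view.get() & 0xff));
        }
        return sb.toString();
    }
}
```

Because the helper only reads from a duplicate, it can be called in the middle of
deserialization without disturbing the reader's state.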

> Add BinarySortableSerDe to Hive
> -------------------------------
>                 Key: HIVE-553
>                 URL: https://issues.apache.org/jira/browse/HIVE-553
>             Project: Hadoop Hive
>          Issue Type: New Feature
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-553.2.patch, HIVE-553.3.patch, HIVE-553.4.patch
> Currently the most popular SerDe in Hive is LazySimpleSerDe. LazySimpleSerDe has the
> benefit of being simple (it uses a text format to store data), but its performance may
> suffer in the following cases:
> 1. Double values are stored in text format, which is very space-inefficient, and both
> serialization and deserialization are slow;
> 2. For columns of complex types with many nesting levels, the buffer is scanned once
> per level, which is very inefficient.
> We should add a binary SerDe that stores the data in binary format. The format
> should have the following properties:
> 1. Compact: it should be space-efficient;
> 2. Fast: it should be efficient to deserialize the data, especially for double values
> and complex types;
> 3. It should support serializing NULL values.
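
To make the space argument for double values concrete, here is a small sketch that stores a
nullable double as one marker byte followed by the 8-byte IEEE 754 value, and compares
that with the length of its text form. The layout, the marker values and all names are
assumptions for illustration only; this message does not describe the byte layout used by
the patch, and the sketch makes no attempt to preserve sort order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical codec: 1 marker byte, then 8 bytes for a non-null double.
final class NullableDoubleCodec {
    static final byte NULL_MARKER = 0;     // assumed marker for NULL
    static final byte NOT_NULL_MARKER = 1; // assumed marker for a present value

    private NullableDoubleCodec() {}

    // NULL costs 1 byte; any other double costs a fixed 9 bytes.
    static byte[] serialize(Double value) {
        if (value == null) {
            return new byte[] { NULL_MARKER };
        }
        return ByteBuffer.allocate(9).put(NOT_NULL_MARKER).putDouble(value).array();
    }

    // Reads the marker first, then the 8-byte value if one is present.
    static Double deserialize(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        if (in.get() == NULL_MARKER) {
            return null;
        }
        return in.getDouble();
    }

    // Size of the same value when stored as text, for comparison.
    static int textSize(double value) {
        return Double.toString(value).getBytes(StandardCharsets.UTF_8).length;
    }
}
```

The binary form has a fixed size and is decoded with a single read, whereas the text form
grows with the number of digits and has to be parsed character by character.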

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.