hadoop-hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-235) DynamicSerDe does not work with Thrift Protocols that can have missing fields for null values
Date Tue, 20 Jan 2009 02:46:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665318#action_12665318 ]

Joydeep Sen Sarma commented on HIVE-235:
----------------------------------------

Looks good for handling the null columns case.

BTW, getNumFields and ordered_types have the same length. It doesn't seem to me that the SerDe
assumes the record has to contain all fields: it bails out when it hits stop, even if not all
fields have been read. So if we are trying to solve an out-of-bounds issue, I'm not sure this
is going to resolve it.


> DynamicSerDe does not work with Thrift Protocols that can have missing fields for null values
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-235
>                 URL: https://issues.apache.org/jira/browse/HIVE-235
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>            Priority: Blocker
>         Attachments: HIVE-235.1.patch
>
>
> The current DynamicSerDe code assumes all fields are present and none are missing.
> However, Thrift Protocols can omit a field when its value is null.
> In that case, DynamicSerDe may misbehave in two ways:
> 1. An array index out of bounds error, because DynamicSerDe assumes the number of fields in the record equals the number in the DDL;
> 2. Fields with null values take their value from the previous record. This may produce wrong results for queries.
> In order to fix this, we need to:
> 1. Pass ObjectInspector/TypeInfo recursively so that we know the number of fields when deserializing the record.
> 2. Clear out fields that are missing from the record.
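
For illustration only, the two fix steps in the description can be sketched roughly as below. This is not the HIVE-235 patch and does not use the DynamicSerDe or Thrift APIs: the class MissingFieldSketch, the deserialize helper, and the map of field index to value (standing in for a record whose null fields were omitted) are all hypothetical. The field count is passed in from the schema side, and the reused row buffer is cleared before each record so a missing field reads as null rather than keeping the previous record's value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names here are assumptions, not Hive or Thrift classes.
final class MissingFieldSketch {

    // record: field index -> value, with null fields simply absent (assumed encoding).
    // numFields: column count taken from the DDL, not from the record itself.
    // reuse: row buffer that is reused across records.
    static List<Object> deserialize(Map<Integer, Object> record, int numFields, List<Object> reuse) {
        // Reset every slot first, so absent fields cannot leak the previous record's values.
        reuse.clear();
        for (int i = 0; i < numFields; i++) {
            reuse.add(null);
        }
        // Fill only the fields that are actually present, ignoring indexes outside the schema.
        for (Map.Entry<Integer, Object> e : record.entrySet()) {
            int idx = e.getKey();
            if (idx >= 0 && idx < numFields) {
                reuse.set(idx, e.getValue());
            }
        }
        return reuse;
    }

    public static void main(String[] args) {
        List<Object> row = new ArrayList<>();

        Map<Integer, Object> first = new HashMap<>();
        first.put(0, "a");
        first.put(1, "b");
        first.put(2, "c");
        System.out.println(deserialize(first, 3, row));  // prints [a, b, c]

        // Field 1 is null, so the record omits it entirely.
        Map<Integer, Object> second = new HashMap<>();
        second.put(0, "x");
        second.put(2, "z");
        System.out.println(deserialize(second, 3, row)); // prints [x, null, z]
    }
}
```

Without the reset loop, the second row in this sketch would come out as [x, b, z], which is the stale-value behavior described in item 2 of the report.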

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

