hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata
Date Tue, 22 Jan 2013 12:00:16 GMT

    [ https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559572#comment-13559572 ]

Namit Jain commented on HIVE-3833:
----------------------------------

bq. In case of identity converter, there is no conversion cost, but in case of non-identity
this will be worse than current impl, since converter will examine every single column value,
which wasn't the case earlier. However, it's not clear how expensive this would be?

The above is fairly difficult to address. In a follow-up, I can add a serde-level
property indicating that the serde can handle different datatypes (e.g.
LazySimpleSerDe). If all the partitions of the table have serdes with this property, then
we can use identityConverter. This is somewhat hacky, and I am not sure it is
useful, since it should not be a common case. Usually, the partition schema should match the
table schema.
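
As a rough illustration of the follow-up idea above, the check could look something like the sketch below. It is not taken from any of the attached patches: the property name, the class name, and the use of plain maps in place of Hive's serde metadata are all assumptions made for this example.

```java
import java.util.List;
import java.util.Map;

public class ConverterChoice {
    // Hypothetical property name, invented for this sketch; not an actual Hive serde property.
    static final String HANDLES_DIFFERENT_TYPES = "serde.handles.different.datatypes";

    /**
     * Returns true only if every partition's serde properties set the flag to "true".
     * Each map is a stand-in for the serde properties of one partition.
     * An empty partition list trivially returns true.
     */
    static boolean canUseIdentityConverter(List<Map<String, String>> partitionSerdeProps) {
        for (Map<String, String> props : partitionSerdeProps) {
            if (!"true".equals(props.get(HANDLES_DIFFERENT_TYPES))) {
                // One partition without the flag is enough to require a real converter.
                return false;
            }
        }
        return true;
    }
}
```

With this shape, a single partition whose serde lacks the property forces the non-identity path for the whole table, which matches the "all the partitions" condition described above.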
                
> object inspectors should be initialized based on partition metadata
> -------------------------------------------------------------------
>
>                 Key: HIVE-3833
>                 URL: https://issues.apache.org/jira/browse/HIVE-3833
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: hive.3833.10.patch, hive.3833.11.patch, hive.3833.12.patch, hive.3833.13.patch,
> hive.3833.14.patch, hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, hive.3833.19.patch,
> hive.3833.1.patch, hive.3833.20.patch, hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch,
> hive.3833.5.patch, hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split based on the
> serdes, etc. Also, we don't allow changing the schema for LazyBinaryColumnarSerDe.
> Instead, different partitions should be part of the same split only if the
> partition schemas match exactly. The operator tree object inspectors should be based
> on the partition schema. That would give greater flexibility and also help with using the
> binary serde with RCFile.
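
The grouping rule in the description (partitions share a split only when their schemas match exactly) can be pictured with the minimal sketch below. It is only an illustration under stated assumptions: the class and method names are hypothetical, a schema is reduced to an ordered list of column type names, and none of this reflects Hive's actual split-generation code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitGrouping {
    /**
     * Groups partition names by schema. The input maps a partition name to its
     * ordered column types (a simplified, hypothetical stand-in for partition metadata).
     * Partitions land in the same group only if their column type lists are equal.
     */
    static Map<List<String>, List<String>> groupBySchema(Map<String, List<String>> partitionSchemas) {
        Map<List<String>, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : partitionSchemas.entrySet()) {
            // List equality is element-wise and ordered, so the schema list works as a map key.
            groups.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return groups;
    }
}
```

Each resulting group could then be given object inspectors built from its own schema, rather than from the table schema.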

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
