hive-dev mailing list archives

From "Ashutosh Chauhan (JIRA)" <>
Subject [jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata
Date Tue, 22 Jan 2013 20:34:12 GMT


Ashutosh Chauhan commented on HIVE-3833:

bq. For the above, it is fairly difficult to address. In a follow-up, I can add a serde level
property, which indicates that the serde can handle different datatypes (for eg. lazySimpleSerde)
- if all the partitions of the table have serde's with this property, then we can use identityConverter.
This is kind of hacky, and am not sure if it is useful, since it should not be a common case.
Usually, the partition schema should match the table schema.

I think this really is a common case. Folks usually change the serde of an existing table
when they find a better FileFormat or, sometimes, when there is a better serde, both
of which are rare occurrences. So, I think we need to think about optimizing this case, though
I agree the approach you suggested is hacky. We need to think of a better approach, probably in
a follow-up jira.
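
For reference, the serde-level property idea quoted above could be sketched roughly as below. This is only an illustration, not code from the patch: PartitionInfo, ConverterKind, choose and the serdeHandlesTypeChanges flag are hypothetical names, and the rule that matching schemas also allow an identity converter is an assumption made for the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these types are Hive classes.
public class ConverterChoiceSketch {

    /** Which kind of row converter the operator tree would need. */
    public enum ConverterKind { IDENTITY, CONVERTING }

    /** Simplified stand-in for the metadata of one partition. */
    public static final class PartitionInfo {
        final List<String> columnTypes;
        // Stand-in for the proposed serde-level property:
        // true if the serde can handle different datatypes.
        final boolean serdeHandlesTypeChanges;

        public PartitionInfo(List<String> columnTypes, boolean serdeHandlesTypeChanges) {
            this.columnTypes = columnTypes;
            this.serdeHandlesTypeChanges = serdeHandlesTypeChanges;
        }
    }

    /**
     * Returns IDENTITY when every partition schema matches the table schema
     * (assumed rule), or when every partition's serde carries the proposed
     * property. Otherwise a real conversion is needed.
     */
    public static ConverterKind choose(List<String> tableColumnTypes,
                                       List<PartitionInfo> partitions) {
        boolean allMatch = true;
        boolean allFlexible = true;
        for (PartitionInfo p : partitions) {
            allMatch &= p.columnTypes.equals(tableColumnTypes);
            allFlexible &= p.serdeHandlesTypeChanges;
        }
        return (allMatch || allFlexible) ? ConverterKind.IDENTITY : ConverterKind.CONVERTING;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("int", "string");
        List<String> older = Arrays.asList("string", "string");
        // One partition with an older schema and a serde without the property.
        System.out.println(choose(table, Arrays.asList(
                new PartitionInfo(table, true),
                new PartitionInfo(older, false))));
    }
}
```

The point of the sketch is that the check is all-or-nothing across partitions: a single partition whose schema differs and whose serde lacks the property forces the converting path for the whole table.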

Also, thanks for updating the patch. Some more comments on the latest patch are on phabricator.
Also, are we going to lose any lazy aspects of deserialization here? I guess not, because
we are just wiring up OIs. But I want to make sure. Can you verify?

> object inspectors should be initialized based on partition metadata
> -------------------------------------------------------------------
>                 Key: HIVE-3833
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: hive.3833.10.patch, hive.3833.11.patch, hive.3833.12.patch, hive.3833.13.patch,
> hive.3833.14.patch, hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, hive.3833.19.patch,
> hive.3833.1.patch, hive.3833.20.patch, hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch,
> hive.3833.5.patch, hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
> Currently, different partitions can be picked up for the same input split based on the
> serdes, etc. And we don't allow changing the schema for LazyColumnarBinarySerDe.
> Instead of that, different partitions should be part of the same split only if the
> partition schemas exactly match. The operator tree object inspectors should be based
> on the partition schema. That would give greater flexibility and also help in using the binary
> serde with RCFile.
