hive-dev mailing list archives

From "Namit Jain (JIRA)" <>
Subject [jira] [Commented] (HIVE-3833) object inspectors should be initialized based on partition metadata
Date Wed, 16 Jan 2013 07:32:12 GMT


Namit Jain commented on HIVE-3833:

bq. Seems to me, this patch will take away the flexibility of combining partitions of different
schemas in one split. That sounds like lesser flexibility instead of more.

No, I am not sure whether I added a test for that, but that should be possible. We know when
a partition is being changed.

bq. Shouldn't we be fixing LazyColumnarBinarySerde in that case, instead of restricting combining
of partitions of different schemas in one split?

That is not the problem (combining partitions) - the problem is that any binary serde will
use the datatypes for serialization, i.e., it will have different storage for int and string
- otherwise, what is the point of it being binary? In that case, unless we use the partition
schema (instead of the table schema), we can get wrong results.
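
To illustrate the point about binary storage, here is a minimal sketch in Python. It is not Hive code and does not reproduce any real serde's byte layout: the functions `write_value` and `read_value` and the toy format (a 4-byte big-endian int, or a 4-byte length followed by UTF-8 bytes for a string) are assumptions made up for this example. It shows that bytes written under the partition's schema (int) decode to a silently wrong value when read with a changed table schema (string).

```python
import struct


def write_value(value, col_type):
    """Serialize one column value with a type-specific binary layout (toy format)."""
    if col_type == "int":
        return struct.pack(">i", value)  # 4-byte big-endian integer
    if col_type == "string":
        data = value.encode("utf-8")
        return struct.pack(">i", len(data)) + data  # length prefix, then bytes
    raise ValueError("unsupported type: " + col_type)


def read_value(buf, col_type):
    """Deserialize one column value, trusting col_type to describe the bytes."""
    if col_type == "int":
        return struct.unpack(">i", buf[:4])[0]
    if col_type == "string":
        (length,) = struct.unpack(">i", buf[:4])
        return buf[4:4 + length].decode("utf-8")
    raise ValueError("unsupported type: " + col_type)


# The partition was written while the column was an int.
partition_bytes = write_value(42, "int")

# Reading with the partition schema recovers the value.
with_partition_schema = read_value(partition_bytes, "int")

# Reading with a table schema that was later changed to string does not fail;
# it misinterprets the int as a length prefix and returns the wrong value.
with_table_schema = read_value(partition_bytes, "string")
```

In this toy format the second read raises no error, which is why the mismatch shows up as wrong results rather than as a failure.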
> object inspectors should be initialized based on partition metadata
> -------------------------------------------------------------------
>                 Key: HIVE-3833
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: hive.3833.10.patch, hive.3833.11.patch, hive.3833.12.patch, hive.3833.13.patch,
> hive.3833.14.patch, hive.3833.1.patch, hive.3833.2.patch, hive.3833.3.patch, hive.3833.4.patch,
> hive.3833.5.patch, hive.3833.6.patch, hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
> Currently, different partitions can be picked up for the same input split based on the
> serdes etc. And, we don't allow changing the schema for LazyColumnarBinarySerDe.
> Instead of that, different partitions should be part of the same split only if the
> partition schemas exactly match. The operator tree object inspectors should be based
> on the partition schema. That would give greater flexibility and also help in using binary
> serde with rcfile.
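
The rule described above (partitions share a split only if their schemas exactly match) can be sketched roughly as follows. This is a hypothetical Python illustration, not Hive's actual split-combining code: the function `group_partitions_by_schema`, the partition names, and the representation of a schema as a list of (column, type) pairs are all assumptions made for the example.

```python
def group_partitions_by_schema(partitions):
    """Group partition names so that each group contains only exactly matching schemas.

    `partitions` is a list of (name, schema) pairs, where schema is a list of
    (column_name, column_type) tuples. Groups keep first-seen order.
    """
    groups = {}
    for name, schema in partitions:
        key = tuple(schema)  # exact match: same columns, same types, same order
        groups.setdefault(key, []).append(name)
    return list(groups.values())


# Hypothetical partitions: ds=2 was written after the column type was changed.
partitions = [
    ("ds=1", [("key", "int"), ("value", "string")]),
    ("ds=2", [("key", "string"), ("value", "string")]),
    ("ds=3", [("key", "int"), ("value", "string")]),
]

split_groups = group_partitions_by_schema(partitions)
```

Here `ds=1` and `ds=3` end up in one group and `ds=2` in another, so a reader initialized from one partition's schema never sees bytes written under a different one.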
