hive-dev mailing list archives

From "Sean Busbey (JIRA)" <>
Subject [jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
Date Sat, 28 Sep 2013 06:03:05 GMT


Sean Busbey commented on HIVE-5302:

Arg. Okay, tl;dr: I need to go back to the drawing board on finding a suitable test. Please
lower priority or close as appropriate.

Long version:

In setting up my test case I was too quick to presume AvroSerdeException showing up in the
logs was a hard failure. But there does appear to be a non-fatal problem when the partition
pruner optimization is working with a non-partitioned avro table. It attempts to make a shadow
partition to represent the whole table. Creating this partition relies on an initializer that
goes through a code path for instantiating the SerDe based on feedback just from MetaStoreUtils.

So the AvroSerDe fails during initialization (and logs a WARN about it with an AvroSerdeException),
but since this instance of the serde is never actually used, it doesn't result in a failure.

You can see this even by running the basic sanity test:

  $> ant clean package
  $> ant -Dmodule=ql -Dtestcase=TestCliDriver -Dqfile=avro_sanity_test.q test
Total time: 1 minute 15 seconds
  $> less build/ql/tmp/hive.log

In the log, grep for AvroSerdeException (for me it's on line 3198).
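
As a rough sketch of that last step, assuming the log ends up at the same path as in the commands above, a small shell function can list every match along with its line number. The function name is made up for illustration and is not part of the Hive build.

```shell
# Hypothetical helper (not part of the Hive build): print every line of a log
# that mentions AvroSerdeException, prefixed with its line number.
find_avro_serde_warnings() {
  # With no argument, fall back to the log path used in the commands above.
  grep -n 'AvroSerdeException' "${1:-build/ql/tmp/hive.log}"
}
```

Calling find_avro_serde_warnings with no argument searches build/ql/tmp/hive.log; pass a different path as the first argument to search another log. The reported line numbers will differ from run to run.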

So sad Sean will need to go back to finding a case where this explodes in a way that stops

On the matter of query plan bloat, we could isolate related changes to the Avro Serde so long
as there's a way to get at table properties during SerDe initialization. That way it could
check partition-specific and then fall back to table on its own. I'll worry about that once
I find a test case.

> PartitionPruner fails on Avro non-partitioned data
> --------------------------------------------------
>                 Key: HIVE-5302
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.11.0
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>            Priority: Blocker
>              Labels: avro
>         Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt
> While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils
> partition retrieval from back in HIVE-4789.
> In this case, the failure is triggered when the partition pruner is handed a non-partitioned
> table and has to construct a pseudo-partition.
> e.g.
> {code}
>   INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table
>   WHERE col <= 9;
> {code}
