hive-dev mailing list archives

From Anthony Hsu via Review Board <nore...@reviews.apache.org>
Subject Re: Review Request 62247: HIVE-17394: AvroSerde is regenerating TypeInfo objects for each nullable Avro field for every row
Date Tue, 12 Sep 2017 22:43:06 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62247/
-----------------------------------------------------------

(Updated Sept. 12, 2017, 10:43 p.m.)


Review request for hive, Carl Steinbach and Ratandeep Ratti.


Changes
-------

Addressed Ratandeep's comment.


Bugs: HIVE-17394
    https://issues.apache.org/jira/browse/HIVE-17394


Repository: hive-git


Description
-------

Previously, whenever the Avro deserializer found a nullable union in the reader schema, it
regenerated the TypeInfo for that field for every record. This patch computes the TypeInfo
once and reuses it for every record.

In our testing, this sped up count() queries by 2x.
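
To illustrate the general idea of computing a per-field value once and reusing it across
records, here is a minimal sketch. It is not the code from the patch: the class
`TypeInfoCacheSketch`, its `generator` function, and the call counter are hypothetical
names made up for this example, and a plain `Object` stands in for an Avro schema.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: memoizes a value derived from a schema object
// so that it is generated once instead of once per record.
final class TypeInfoCacheSketch<S, T> {
    // Keyed by reference, assuming the same schema object is seen for every record.
    private final Map<S, T> cache = new IdentityHashMap<>();
    private final Function<S, T> generator;
    private int generatorCalls = 0;

    TypeInfoCacheSketch(Function<S, T> generator) {
        this.generator = generator;
    }

    // Returns the cached value, invoking the (assumed expensive) generator
    // only the first time a given schema object is seen.
    T get(S schema) {
        T cached = cache.get(schema);
        if (cached == null) {
            generatorCalls++;
            cached = generator.apply(schema);
            cache.put(schema, cached);
        }
        return cached;
    }

    int generatorCalls() {
        return generatorCalls;
    }

    public static void main(String[] args) {
        // The lambda stands in for an expensive schema-to-type conversion.
        TypeInfoCacheSketch<Object, String> cache =
                new TypeInfoCacheSketch<>(schema -> new String("nullable-string"));
        Object fieldSchema = new Object(); // stand-in for one field's schema

        // Simulate deserializing three records that share the same field schema.
        for (int row = 0; row < 3; row++) {
            cache.get(fieldSchema);
        }
        System.out.println("generator calls: " + cache.generatorCalls());
    }
}
```

In this sketch the generator runs once for the three simulated records. Keying by object
identity is an assumption that suits a reader schema whose field objects stay the same
for the lifetime of the deserializer.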


Diffs (updated)
-----

  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java ecfe15f59dac04bda3f8f1275babebf736608a6b



Diff: https://reviews.apache.org/r/62247/diff/2/

Changes: https://reviews.apache.org/r/62247/diff/1-2/


Testing
-------

`mvn clean package -DskipTests -Dmaven.javadoc.skip=true` succeeded.


Thanks,

Anthony Hsu

