hive-dev mailing list archives

From Anthony Hsu <a...@linkedin.com>
Subject Review Request 54094: HIVE-15190: Field names are not preserved in ORC files written with ACID
Date Sat, 26 Nov 2016 23:03:35 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54094/
-----------------------------------------------------------

Review request for hive.


Bugs: HIVE-15190
    https://issues.apache.org/jira/browse/HIVE-15190


Repository: hive-git


Description
-------

Previously, when writing to an ACID ORC table, the file written to disk had a schema
of `struct<...(acid columns)...,row:struct<_col0:int,_col1:string,...>>`. The inner `row`
struct used the virtual column names `_col0`, `_col1`, etc., instead of the actual table
column names. With this patch, the actual table column names are preserved in the file schema.

Having the actual table column names in the ORC file itself is needed when doing schema evolution
based on field names: https://issues.apache.org/jira/browse/ORC-54
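
To illustrate the difference in the `row` struct, here is a minimal standalone sketch. It is
not code from the patch and does not use the Hive or ORC APIs. The class `RowSchemaSketch`,
its helper methods, and the column names `id` and `name` are hypothetical and chosen only
for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical illustration only; not part of Hive or ORC.
public class RowSchemaSketch {

    // Builds a struct type string such as "struct<a:int,b:string>"
    // from parallel lists of field names and type names.
    public static String rowStruct(List<String> names, List<String> types) {
        if (names.size() != types.size()) {
            throw new IllegalArgumentException("names and types must have the same length");
        }
        StringJoiner fields = new StringJoiner(",", "struct<", ">");
        for (int i = 0; i < names.size(); i++) {
            fields.add(names.get(i) + ":" + types.get(i));
        }
        return fields.toString();
    }

    // Generates the virtual names _col0, _col1, ... for the given column count.
    public static List<String> placeholderNames(int count) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add("_col" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList("int", "string");
        // Shape of the row struct before the patch: virtual names.
        System.out.println(rowStruct(placeholderNames(types.size()), types));
        // Shape of the row struct after the patch: actual table column names
        // (assumed here to be "id" and "name").
        System.out.println(rowStruct(Arrays.asList("id", "name"), types));
    }
}
```

The first line printed is `struct<_col0:int,_col1:string>` and the second is
`struct<id:int,name:string>`. A reader that matches columns by field name can only do so
against the second form.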


Diffs
-----

  orc/src/java/org/apache/orc/impl/SchemaEvolution.java 7379de93a7f39d734ef7695c197bd9f24bc84321

  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 53660206e3f59c37be261b1a9796f04721a244f3

  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java efde2db482367f1037c486df9c5cabd67b1368ed

  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 492c64c29e8d4f38d857381bc375074e06868f7c

  ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java 75c7680e267ab44e426d0b21c6fd6dce6a352bbd

  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 49ba6675bae5b3e6d8bf1fa2e9ed8d2a27b7f83a


Diff: https://reviews.apache.org/r/54094/diff/


Testing
-------

Added a unit test. Also ran some of the existing ACID tests, and they still pass.


Thanks,

Anthony Hsu

