hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-17085) ORC file merge/concatenation should do full schema check
Date Thu, 13 Jul 2017 21:42:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-17085:
-----------------------------------------
    Description: 
The ORC merge/concatenation compatibility check only verifies that the outer-level column
count matches. ORC schema evolution now supports inner structs as well, so the outer-level
column count can match while the inner (nested) columns do not. The compatibility check should
do a full schema match before merging/concatenation. This issue does not cause data loss, but
it does cause task failures with an exception like the one below:
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close OrcFileMergeOperator
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
	... 16 more
Caused by: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
	at org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
	at org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
	at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
	at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
	at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
	... 19 more
{code}

Concatenation should also make sure that the writer version matches (it currently checks only
that the file version matches).
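
For illustration only, the difference between an outer-level column count check and a
full schema match can be sketched as below. This is not the Hive or ORC implementation:
TypeNode is a hypothetical stand-in for an ORC type tree, and the class name, method
names and version strings are assumptions made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class OrcMergeCompatSketch {

    /** Hypothetical stand-in for one node of an ORC type tree (not a real ORC class). */
    static final class TypeNode {
        final String category;          // e.g. "struct", "int", "string"
        final List<String> fieldNames;  // non-empty only for structs
        final List<TypeNode> children;  // child types, in field order

        TypeNode(String category, List<String> fieldNames, List<TypeNode> children) {
            this.category = category;
            this.fieldNames = fieldNames;
            this.children = children;
        }

        static TypeNode primitive(String category) {
            return new TypeNode(category,
                    Collections.<String>emptyList(), Collections.<TypeNode>emptyList());
        }

        static TypeNode struct(List<String> names, List<TypeNode> children) {
            return new TypeNode("struct", names, children);
        }
    }

    /** Weak check: compares only the number of outer-level columns. */
    static boolean sameOuterColumnCount(TypeNode a, TypeNode b) {
        return a.children.size() == b.children.size();
    }

    /** Full check: category, field names and every nested child must match. */
    static boolean sameFullSchema(TypeNode a, TypeNode b) {
        if (!a.category.equals(b.category)) return false;
        if (!a.fieldNames.equals(b.fieldNames)) return false;
        if (a.children.size() != b.children.size()) return false;
        for (int i = 0; i < a.children.size(); i++) {
            if (!sameFullSchema(a.children.get(i), b.children.get(i))) return false;
        }
        return true;
    }

    /** Mergeable only if schema, file version and writer version all match. */
    static boolean canMerge(TypeNode a, String fileVerA, String writerVerA,
                            TypeNode b, String fileVerB, String writerVerB) {
        return sameFullSchema(a, b)
                && fileVerA.equals(fileVerB)
                && writerVerA.equals(writerVerB);
    }

    /** struct<id:int, info:struct<name:string>> */
    static TypeNode sampleBefore() {
        TypeNode info = TypeNode.struct(Arrays.asList("name"),
                Arrays.asList(TypeNode.primitive("string")));
        return TypeNode.struct(Arrays.asList("id", "info"),
                Arrays.asList(TypeNode.primitive("int"), info));
    }

    /** struct<id:int, info:struct<name:string, age:int>> (inner struct evolved) */
    static TypeNode sampleAfter() {
        TypeNode info = TypeNode.struct(Arrays.asList("name", "age"),
                Arrays.asList(TypeNode.primitive("string"), TypeNode.primitive("int")));
        return TypeNode.struct(Arrays.asList("id", "info"),
                Arrays.asList(TypeNode.primitive("int"), info));
    }

    public static void main(String[] args) {
        TypeNode before = sampleBefore();
        TypeNode after = sampleAfter();
        // Outer level has two columns in both files, so the weak check passes.
        System.out.println(sameOuterColumnCount(before, after)); // true
        // The inner struct differs, so the full check rejects the pair.
        System.out.println(sameFullSchema(before, after));       // false
    }
}
```

In this sketch the two sample schemas both have two outer-level columns, so a count-only
check would let them through to the merge, while the recursive comparison rejects them
because the inner struct gained a field. canMerge additionally compares a writer version
string alongside the file version, mirroring the last point above.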

  was:
The ORC merge/concatenation compatibility check only verifies that the outer-level column
count matches. ORC schema evolution now supports inner structs as well, so the outer-level
column count can match while the inner (nested) columns do not. The compatibility check should
do a full schema match before merging/concatenation. This issue does not cause data loss, but
it does cause task failures with an exception like the one below:
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close OrcFileMergeOperator
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
	... 16 more
Caused by: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
	at org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
	at org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
	at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
	at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
	at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
	at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
	... 19 more
{code}


> ORC file merge/concatenation should do full schema check
> --------------------------------------------------------
>
>                 Key: HIVE-17085
>                 URL: https://issues.apache.org/jira/browse/HIVE-17085
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.2.0, 2.3.0, 3.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
