hive-user mailing list archives

From Christian Rivasseau <christ...@lefty.io>
Subject error while writing from spark: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
Date Wed, 04 Nov 2015 16:22:27 GMT
Hi all,

I have been writing external Hive tables (in the ORC format) from a Spark
job. I do not set any custom options; after serializing each row with a
SerDe, the code boils down to:

    p.saveAsNewAPIHadoopFile(output, NullWritable.class,
        Writable.class, OrcNewOutputFormat.class);
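In case it helps, here is a minimal sketch of what our write path looks
like. The column names and types are hypothetical placeholders (our real
schema differs), and the SerDe and inspector are built per record only to
keep the sketch short:

    import java.util.Arrays;
    import java.util.List;

    import org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat;
    import org.apache.hadoop.hive.ql.io.orc.OrcSerde;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Writable;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import scala.Tuple2;

    public class OrcWriteSketch {
        // rows: each element holds one record's field values in
        // declaration order.
        public static void write(JavaRDD<List<Object>> rows, String output) {
            JavaPairRDD<NullWritable, Writable> p = rows.mapToPair(row -> {
                // Hypothetical two-column schema (id: bigint, name: string).
                OrcSerde serde = new OrcSerde();
                ObjectInspector inspector =
                    ObjectInspectorFactory.getStandardStructObjectInspector(
                        Arrays.asList("id", "name"),
                        Arrays.asList(
                            PrimitiveObjectInspectorFactory.javaLongObjectInspector,
                            PrimitiveObjectInspectorFactory.javaStringObjectInspector));
                // OrcSerde.serialize wraps the row and inspector in a
                // Writable that OrcNewOutputFormat knows how to write.
                return new Tuple2<>(NullWritable.get(),
                                    serde.serialize(row, inspector));
            });
            // Same call as in our job: NullWritable key, serialized ORC
            // row as value, OrcNewOutputFormat as the output format.
            p.saveAsNewAPIHadoopFile(output, NullWritable.class,
                Writable.class, OrcNewOutputFormat.class);
        }
    }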

The job had been working for months and just suddenly started failing,
consistently, with the following error:

2015-11-04 14:26:47 WARN  TaskSetManager:71 - Lost task 136.1 in stage 0.0 (TID 125, lefty-hadoop-d): java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:719)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1615)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1997)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2289)
    at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.close(OrcNewOutputFormat.java:67)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:1007)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:979)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)



This seems somewhat similar to
https://issues.apache.org/jira/browse/HIVE-9080, except that our version
of hive-exec is 1.2.1, where that issue should have long since been fixed.
It may (or may not) be relevant that this happens while writing from a
custom Hadoop job, not from inside Hive.

Any ideas?

Thanks a lot,
