hive-dev mailing list archives

From David Capwell <dcapw...@gmail.com>
Subject ORC NPE while writing stats
Date Wed, 02 Sep 2015 05:51:08 GMT
We are writing ORC files in our application for Hive to consume.
After the application has been running long enough, writes start failing
with an NPE while serializing a string column's stats. We are not sure
what is causing it on our side yet: replaying the same data works fine,
so it seems tied to how long the JVM has been running rather than to the
data itself (different data sources hit this at around the same time in
the same JVM).

Here is the code in question:

final Writer writer = OrcFile.createWriter(path,
    OrcFile.writerOptions(conf).inspector(oi));
try {
  for (Data row : rows) {
    List<Object> struct = Orc.struct(row, inspector);
    writer.addRow(struct);
  }
} finally {
  writer.close();
}
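
Since several data sources write from the same JVM, one diagnostic variant of this loop would be to push every write through a single JVM-wide lock and see whether the failure goes away. The sketch below is only that, a sketch: it assumes (unconfirmed) that concurrent writers are interfering with each other, and `RowSink`, `SharedCounterSink`, and `writeFromThreads` are hypothetical stand-ins, not Hive types. It avoids lambdas because we run on Java 1.7.

```java
import java.util.ArrayList;
import java.util.List;

final class LockedWriteSketch {
    /** Hypothetical stand-in for the ORC Writer; not a Hive type. */
    interface RowSink {
        void addRow(Object row);
        void close();
    }

    /** Single lock shared by every wrapped sink in this JVM. */
    private static final Object WRITE_LOCK = new Object();

    /** Wraps a sink so its calls never overlap with calls on any other wrapped sink. */
    static RowSink locked(final RowSink delegate) {
        return new RowSink() {
            @Override public void addRow(Object row) {
                synchronized (WRITE_LOCK) { delegate.addRow(row); }
            }
            @Override public void close() {
                synchronized (WRITE_LOCK) { delegate.close(); }
            }
        };
    }

    /** Deliberately unsynchronized sink that bumps a counter shared across sinks. */
    static final class SharedCounterSink implements RowSink {
        private final int[] shared;
        SharedCounterSink(int[] shared) { this.shared = shared; }
        @Override public void addRow(Object row) { shared[0]++; }
        @Override public void close() { }
    }

    /** Runs several writer threads through locked sinks and returns the shared count. */
    static int writeFromThreads(int threads, final int rowsPerThread) {
        final int[] shared = new int[1];
        List<Thread> workers = new ArrayList<Thread>();
        for (int i = 0; i < threads; i++) {
            final RowSink sink = locked(new SharedCounterSink(shared));
            Thread t = new Thread(new Runnable() {
                @Override public void run() {
                    try {
                        for (int r = 0; r < rowsPerThread; r++) {
                            sink.addRow("row-" + r);
                        }
                    } finally {
                        sink.close();
                    }
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared[0];
    }
}
```

In the real application the wrapper would delegate to the ORC writer instead of `SharedCounterSink`. If the NPE stops with the lock in place, that points at cross-thread interference; if it does not, this hypothesis can be dropped.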


Here is the exception:

java.lang.NullPointerException: null
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$Builder.setMinimum(OrcProto.java:1803) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.ColumnStatisticsImpl$StringStatisticsImpl.serialize(ColumnStatisticsImpl.java:411) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.createRowIndexEntry(WriterImpl.java:1255) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.createRowIndexEntry(WriterImpl.java:775) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.createRowIndexEntry(WriterImpl.java:775) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.createRowIndexEntry(WriterImpl.java:1978) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1985) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.checkMemory(WriterImpl.java:322) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.MemoryManager.notifyWriters(MemoryManager.java:168) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.MemoryManager.addedRow(MemoryManager.java:157) ~[hive-exec-0.14.0.jar:0.14.0]
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:2276) ~[hive-exec-0.14.0.jar:
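
For context on the top frame: protobuf-generated Java builders throw a NullPointerException when a string setter is handed null, so the minimum was evidently null when the stats were serialized. The sketch below uses hypothetical stand-in classes (`MinBuilder`, `StringStats`), not the Hive source. It assumes the serializer only calls the setter when the value count is nonzero, and shows that a state with a nonzero count and a null minimum fails in the same way. How the real writer would get into such a state is exactly what we do not know yet.

```java
final class StatsNullSketch {
    /** Hypothetical builder mimicking the null check in protobuf-generated setters. */
    static final class MinBuilder {
        private String minimum;

        MinBuilder setMinimum(String value) {
            if (value == null) {
                throw new NullPointerException();
            }
            minimum = value;
            return this;
        }

        String getMinimum() {
            return minimum;
        }
    }

    /** Hypothetical stats holder; count and minimum are separate fields. */
    static final class StringStats {
        long count;
        String minimum;

        /** Normal path: both fields move together. */
        void update(String value) {
            if (minimum == null || value.compareTo(minimum) < 0) {
                minimum = value;
            }
            count++;
        }

        /** Assumed shape of serialization: the setter is skipped only when count is zero. */
        MinBuilder serialize() {
            MinBuilder builder = new MinBuilder();
            if (count != 0) {
                builder.setMinimum(minimum);
            }
            return builder;
        }
    }

    /** Returns true when serializing the given stats throws an NPE. */
    static boolean serializeThrowsNpe(StringStats stats) {
        try {
            stats.serialize();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

A consistently updated `StringStats` serializes without error; forcing `count` to a nonzero value while `minimum` is still null makes `serializeThrowsNpe` return true.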


Versions:

Hadoop: Apache 2.2.0
Hive: Apache 0.14.0
Java: 1.7


Thanks for your time reading this email.
