hive-user mailing list archives

From Gopal Vijayaraghavan <>
Subject Re: Hive 1.2.1 (HDP) ArrayIndexOutOfBounds for highly compressed ORC files
Date Tue, 27 Feb 2018 02:06:56 GMT

> Caused by: java.lang.ArrayIndexOutOfBoundsException
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(

In general, HDP-specific issues tend to get more attention on HCC, but this is a pretty old
issue stemming from MapReduce being designed for fairly low-memory JVMs.

The io.sort.mb size is the reason for this crash: sort buffers larger than 1Gb hit a corner
case in the buffer's wrap-around handling.

As odd as this might sound, if you have fewer splits the sort buffer wouldn't wrap around
enough times to generate a negative offset.

You can lower the sort buffer (io.sort.mb) to 1024Mb or less as a slower workaround.
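
To give a feel for why the 1Gb mark matters, here is a minimal sketch of how 32-bit offset
arithmetic on a circular buffer can go negative. This is not the actual MapTask code; the
class name, method names and the distance formula are all made up for illustration. It only
assumes that offsets are Java ints and that some expression adds an offset to the buffer
capacity before wrapping.

```java
// Hypothetical illustration only; this is NOT the Hadoop MapTask source.
class SortBufferWrapDemo {
    static final int MB = 1 << 20;

    // Bytes from start to end in a circular buffer, using 32-bit int math.
    // end + capacity can exceed Integer.MAX_VALUE once capacity > 1024 MB,
    // and Java's % keeps the sign of the (now negative) dividend.
    static int distanceInt(int start, int end, int capacity) {
        return (end + capacity - start) % capacity;
    }

    // Same computation widened to long, so the intermediate sum cannot overflow.
    static int distanceLong(int start, int end, int capacity) {
        return (int) (((long) end + capacity - start) % capacity);
    }

    public static void main(String[] args) {
        // Compare a 1024 MB buffer with a 1536 MB buffer, offset at the last byte.
        for (int sortMb : new int[] {1024, 1536}) {
            int capacity = sortMb * MB;
            int end = capacity - 1;
            System.out.println(sortMb + " MB: int=" + distanceInt(0, end, capacity)
                + " long=" + distanceLong(0, end, capacity));
        }
    }
}
```

In this sketch the 1024 MB case stays within int range, while the 1536 MB case yields a
negative value from the int version. A negative offset used as an array index is the kind of
thing that surfaces as an ArrayIndexOutOfBoundsException.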

I ran into this issue in 2013 and started working on optimizing sort for larger buffers for
MapReduce (MAPREDUCE-4755), but ended up rewriting the entire thing & then added it to

