hive-user mailing list archives

From Robin Verlangen <ro...@us2.nl>
Subject Hive double issues while moving around RC files between clusters
Date Sat, 13 Jun 2015 12:42:43 GMT
Hi there,

I was copying RC files from a CDH Hadoop 2.0 cluster to a new HDP
Hadoop 2.6 cluster.

After creating a new table stored as RCFile, with LOCATION pointing at the
copied files, I can query all columns except the ones that are DOUBLE.
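
For reference, the table was created roughly like this (table and column
names here are placeholders of mine, not from the actual schema):

```sql
-- Hypothetical DDL matching the setup described above: an RCFile table
-- whose LOCATION points at the files copied over from the old cluster.
CREATE EXTERNAL TABLE events_copy (
  id     BIGINT,
  amount DOUBLE   -- the column type that fails to deserialize
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
STORED AS RCFILE
LOCATION '/data/migrated/events';
```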

I tried querying with Hive (via Tez and MR), Beeline, and Presto. None of
them work.

The error from Hive is:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 20221
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.byteArrayToLong(LazyBinaryUtils.java:84)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryDouble.init(LazyBinaryDouble.java:43)
    at org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase$FieldInfo.uncheckedGetField(ColumnarStructBase.java:111)
    at org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase.getField(ColumnarStructBase.java:172)
    at org.apache.hadoop.hive.serde2.objectinspector.ColumnarStructObjectInspector.getStructFieldData(ColumnarStructObjectInspector.java:67)
    at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:140)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:353)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:197)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:183)

The error from Presto is less verbose, but points in the same direction:

Query 20150613_114049_00297_468ni failed: Double should be 8 bytes

Both errors point at the DOUBLE columns as the source of the problem.
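
If it helps to see why both errors would surface this way: as far as I
understand it, a LazyBinary DOUBLE is a fixed 8-byte value (this is an
assumption on my part; the function below is a Python sketch of mine, not
Hive's actual reader). If the byte range recorded for a column value is
shorter than 8 bytes, a reader either walks past the end of the buffer
(Hive's ArrayIndexOutOfBoundsException) or rejects the length up front
(Presto's "Double should be 8 bytes"):

```python
import struct

def read_fixed_width_double(buf: bytes, offset: int) -> float:
    """Read one 8-byte big-endian double from buf at offset, mimicking
    the fixed-width layout that LazyBinaryDouble appears to expect."""
    if offset + 8 > len(buf):
        # Analogous to Hive's ArrayIndexOutOfBoundsException and
        # Presto's "Double should be 8 bytes" complaint.
        raise IndexError(
            f"need 8 bytes at offset {offset}, have {len(buf) - offset}"
        )
    (value,) = struct.unpack_from(">d", buf, offset)
    return value

good = struct.pack(">d", 3.14)            # a full 8-byte double
print(read_fixed_width_double(good, 0))   # 3.14 (exact round-trip)

truncated = good[:5]                      # column bytes shorter than 8
try:
    read_fixed_width_double(truncated, 0)
except IndexError as e:
    print("error:", e)
```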

As for table settings, both serdes are
'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe', which should be
correct.
The Hive versions on the old (0.8) and new (0.14) clusters differ quite a
bit, but the RC files themselves are still valid (checksums match); only
the doubles are "stuck".

I tried the decimal-type migration described at
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-DecimalTypeIncompatibilitiesbetweenHive0.12.0and0.13.0
but that doesn't seem to help either.

Any idea on how I can resolve this?

Thanks in advance!

Best regards,

Robin Verlangen
*Chief Data Architect*

W http://www.robinverlangen.nl
E robin@us2.nl

<http://goo.gl/Lt7BC>
*What is CloudPelican? <http://goo.gl/HkB3D>*

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.
