Date: Wed, 28 Jan 2015 23:15:35 +0000 (UTC)
From: Sergio Peña (JIRA)
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-9502) Parquet cannot read Map types from files written with Hive <= 0.12


     [ https://issues.apache.org/jira/browse/HIVE-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergio Peña updated HIVE-9502:
------------------------------
    Attachment: alltypesparquet

> Parquet cannot read Map types from files written with Hive <= 0.12
> ------------------------------------------------------------------
>
>                 Key: HIVE-9502
>                 URL: https://issues.apache.org/jira/browse/HIVE-9502
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>         Attachments: HIVE-9502.1.patch, HIVE-9502.2.patch, HIVE-9502.3.patch, alltypesparquet
>
>
> When reading a Parquet file written by Hive <= 0.12, the following error is thrown:
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
>         at org.apache.hadoop.hive.ql.io.parquet.serde.AbstractParquetMapInspector.getMap(AbstractParquetMapInspector.java:73)
>         at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:519)
>         at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:443)
>         at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:582)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
>         at org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
>         at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:539)
>         ... 9 more
> {noformat}
> This is because old versions of Hive (<= 0.12) write Map types using the following schema:
> {noformat}
> optional group m1 (MAP_KEY_VALUE) {
>     repeated group map {
>         required binary key;
>         optional binary value;
>     }
> }
> {noformat}
> PARQUET-113 introduces new annotations for Parquet nested types:
> https://github.com/rdblue/incubator-parquet-format/blob/PARQUET-113-add-list-and-map-spec/LogicalTypes.md#maps
> The correct schema is now:
> {noformat}
> optional group m1f (MAP) {
>     repeated group map (MAP_KEY_VALUE) {
>         required binary key;
>         optional binary value;
>     }
> }
> {noformat}
> We should remain backward compatible with the old schema as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
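
[Editor's sketch] The backward-compatibility requirement in the issue can be pictured with a small, self-contained Python model. This is not Hive or Parquet code: `SchemaNode`, `map_key_value_fields` and `kv_group` are hypothetical names introduced only for illustration, and the second inner field is assumed to be named `value`, following the usual Parquet map convention. The idea shown is to accept a map column whether its outer group carries the legacy MAP_KEY_VALUE annotation or the newer MAP annotation, by looking only at the shape underneath it (one repeated group holding two fields).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SchemaNode:
    """Hypothetical stand-in for a Parquet schema node (not a real Parquet or Hive class)."""
    name: str
    repetition: str                   # "required", "optional" or "repeated"
    annotation: Optional[str] = None  # e.g. "MAP" or "MAP_KEY_VALUE"
    children: List["SchemaNode"] = field(default_factory=list)


def map_key_value_fields(map_node: SchemaNode) -> Tuple[SchemaNode, SchemaNode]:
    """Return the (key, value) nodes of a map column in either layout.

    The outer group may be annotated MAP (newer layout) or MAP_KEY_VALUE
    (legacy layout). The annotation on the inner repeated group is ignored,
    so both layouts resolve to the same pair of fields.
    """
    if map_node.annotation not in ("MAP", "MAP_KEY_VALUE"):
        raise ValueError(f"{map_node.name} is not annotated as a map")
    if len(map_node.children) != 1 or map_node.children[0].repetition != "repeated":
        raise ValueError(f"{map_node.name} must contain exactly one repeated group")
    entries = map_node.children[0].children
    if len(entries) != 2:
        raise ValueError(f"{map_node.name} must hold a key field and a value field")
    return entries[0], entries[1]


def kv_group(annotation: Optional[str] = None) -> SchemaNode:
    """Build the inner repeated key/value group, optionally annotated."""
    return SchemaNode("map", "repeated", annotation, [
        SchemaNode("key", "required"),
        SchemaNode("value", "optional"),
    ])


# Legacy layout: annotation on the outer group only.
legacy = SchemaNode("m1", "optional", "MAP_KEY_VALUE", [kv_group()])
# Newer layout: MAP on the outer group, MAP_KEY_VALUE on the inner one.
current = SchemaNode("m1f", "optional", "MAP", [kv_group("MAP_KEY_VALUE")])

for column in (legacy, current):
    key, value = map_key_value_fields(column)
    print(column.name, key.name, value.name)
```

Both columns resolve to the same key and value fields, which is the behavior the issue asks a reader to provide; how the actual patch achieves this inside Hive is not described in this message.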