hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Arup Malakar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9960) Hive not backward compatibilie while adding optional new field to struct in parquet files
Date Mon, 16 Mar 2015 21:20:40 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363971#comment-14363971
] 

Arup Malakar commented on HIVE-9960:
------------------------------------

[~spena] I have been using hive 0.14 with HIVE-8909 applied on it. This patch doesn't apply
cleanly on top of that, as this patch seems to be meant for trunk. On the other hand I am
unable to compile trunk against hadoop 2.3.0 as it complains about package KeyProvider etc.
I would need to port this patch for 0.14, with HIVE-8909, to be able to test it.

> Hive not backward compatibilie while adding optional new field to struct in parquet files
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-9960
>                 URL: https://issues.apache.org/jira/browse/HIVE-9960
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Arup Malakar
>            Assignee: Sergio Peña
>         Attachments: HIVE-9960.1.patch
>
>
> I recently added an optional field to a struct, when I tried to query old data with the
new hive table which has the new field as column it throws error. Any clue how I can make
it backward compatible so that I am still able to query old data with the new table definition.
>  
> I am using hive-0.14.0 release with  HIVE-8909 patch applied.
> Details:
> New optional field in a struct
> {code}
> struct Event {
>     1: optional Type type;
>     2: optional map<string, string> values;
>     3: optional i32 num = -1; // <--- New field
> }
> {code}
> Main thrift definition
> {code}
>  10: optional list<Event> events;
> {code}
> Corresponding hive table definition
> {code}
>   events array< struct <type: string, values: map<string, string>, num: int>>)
> {code}
> Try to read something from the old data, using the new table definition
> {{select events from table1 limit 1;}}
> Failed with exception:
> {code}
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException:
2                                   
> Error thrown:                                       
> 15/03/12 17:23:43 [main]: ERROR CliDriver: Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ArrayIndexOutOfBoundsException: 2                               
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException:
2                                                                                        
      
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:152)       
                                                                                         
                                  
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1621)            
                                                                                         
                                  
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:267) 
                                                                                         
                                  
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)      
                                                                                         
                                  
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)     
                                                                                         
                                  
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)   
                                                                                         
                                  
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)             
                                                                                         
                                  
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)            
                                                                                         
                                  
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)              
                                                                                         
                                  
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                                                                                         
                              
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                                                                                         
                      
>         at java.lang.reflect.Method.invoke(Method.java:597)                         
                                                                                         
                                  
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:212)                      
                                                                                         
                                  
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException:
2                                                                                        
                
>         at org.apache.hadoop.hive.ql.exec.ListSinkOperator.processOp(ListSinkOperator.java:90)
                                                                                         
                        
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)       
                                                                                         
                                  
>         at org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51)
                                                                                         
                              
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)       
                                                                                         
                                  
>         at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
                                                                                         
                            
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)       
                                                                                         
                                  
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
                                                                                         
                      
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:571)
                                                                                         
                               
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:563)
                                                                                         
                               
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)       
                                                                                         
                                  
>         ... 12 more                                                                 
                                                                                         
                                  
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 2                              
                                                                                         
                                  
>         at org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldData(ArrayWritableObjectInspector.java:140)
                                                                    
>         at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:353)
                                                                                         
                              
>         at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:306)
                                                                                         
                              
>         at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:197)
                                                                                         
                                
>         at org.apache.hadoop.hive.serde2.DelimitedJSONSerDe.serializeField(DelimitedJSONSerDe.java:60)
                                                                                         
                
>         at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:422)
                                                                                         
                   
>         at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:50)
                                                                                         
     
>         at org.apache.hadoop.hive.ql.exec.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:71)
                                                                                         
                
>         at org.apache.hadoop.hive.ql.exec.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:40)
                                                                                         
                
>         at org.apache.hadoop.hive.ql.exec.ListSinkOperator.processOp(ListSinkOperator.java:87)
                                                                                         
                        
>         ... 21 more 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message