hive-user mailing list archives

From Daniel Haviv <daniel.ha...@veracity-group.com>
Subject Re: Adding new columns to parquet based Hive table
Date Wed, 14 Jan 2015 19:47:52 GMT
Hi Kumar,
Altering the table only updates Hive's metadata; it does not update the schema stored in the Parquet files.
I believe that if you insert into your table (after adding the column), you'll be able to
select all 3 columns later on.
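
A minimal, untested sketch of that sequence in HiveQL. The table `t` and the columns `f1`, `f2`, `f3` come from the steps quoted below; the source table `t_staging` is hypothetical and only stands in for wherever the new three-column rows come from. Whether the old two-column files become readable afterwards may depend on the Hive version, so treat this as something to try rather than a confirmed fix.

```sql
-- Step 4 from the quoted message: changes only the metastore definition of t.
ALTER TABLE t ADD COLUMNS (f3 STRING);

-- Write new rows through Hive so the new Parquet files carry all three columns.
-- t_staging is a hypothetical table that already holds f1, f2 and f3.
INSERT INTO TABLE t
SELECT f1, f2, f3
FROM t_staging;

-- Then retry the read that failed in step 5.
SELECT f1, f2, f3 FROM t;
```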

Daniel

> On 14 Jan 2015, at 21:34, Kumar V <kumarbuyonline@yahoo.com> wrote:
> 
> Hi,
> 
>     Any ideas on how to go about this? Any insights you have would be helpful. I am kinda stuck here.
> 
> Here are the steps I followed on hive 0.13
> 
> 1) create table t (f1 String, f2 string) stored as Parquet;
> 2) upload parquet files with 2 fields
> 3) select * from t; <---- Works fine.
> 4) alter table t add columns (f3 string);
> 5) Select * from t; <----- ERROR "Caused by: java.lang.IllegalStateException: Column f3 at index 2 does not exist"
>   at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:116)
>   at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
>   at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:79)
>   at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
>   at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>   at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
> 
> On Wednesday, January 7, 2015 2:55 PM, Kumar V <kumarbuyonline@yahoo.com> wrote:
> 
> 
> Hi,
>     I have a Parquet format Hive table with a few columns.  I have loaded a lot of data to this table already and it seems to work.
> I have to add a few new columns to this table.  If I add new columns, queries don't work anymore since I have not reloaded the old data.
> Is there a way to add new fields to the table and not reload the old Parquet files and make the query work?
> 
> I tried this on Hive 0.10 and also on Hive 0.13, and I get an error in both versions.
> 
> Please let me know how to handle this.
> 
> Regards,
> Kumar. 
> 
> 
