hive-user mailing list archives

From Ryan LeCompte <>
Subject Adding new columns to existing Hive tables
Date Tue, 09 Mar 2010 20:24:50 GMT
It looks like we can add columns to existing tables via:

ALTER TABLE table_name ADD|REPLACE COLUMNS (col_name data_type
[COMMENT col_comment], ...)
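
As a concrete illustration of that syntax, a minimal sketch might look like
the following. The table name page_views and the column referrer_url are
hypothetical placeholders, not a real schema:

```sql
-- Hypothetical table and column names, for illustration only.
-- Appends one new column to the end of the table's column list.
ALTER TABLE page_views ADD COLUMNS (
  referrer_url STRING COMMENT 'Populated only for newly loaded partitions'
);
```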

However, I see the following comment in the Hive docs:

"NOTE: These commands will only modify Hive's metadata, and will NOT
reorganize or reformat existing data. Users should make sure the actual
data layout conforms with the metadata definition."

Question: If we already have a table that has lots of data in it, and
I execute the above statement to add a column, will I still be able to
query existing data? Or do I need to somehow re-import all of the data
and fill in a value for the new column? The idea is to be able to add
a new column, and make sure that the column value exists for all NEW
partitions in the same table. I would hate to have to reload all of
the old data just to specify a NULL value for the new column.

Will this work as expected, or is a data reload necessary every time
we add a new column in order to still query older data?


