hive-user mailing list archives

From Anson Abraham <anson.abra...@gmail.com>
Subject How to handle for new columns?
Date Thu, 01 Mar 2012 20:06:18 GMT
I have an external Hive table that reads my "log files". If a new file
with an additional column is imported into HDFS and I alter the Hive table
to add that column, how can I get Hive to handle the old files that do not
have the new column?
For example, I have a few files with these fields:

empid, empname, deptno

and my Hive table is defined as:
CREATE EXTERNAL TABLE IF NOT EXISTS Employee (
empid BIGINT
,empname string
,deptno BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE LOCATION 'hdfs://namenode1/employee/';



But suppose a new file is imported into the HDFS directory with a new column:
empid, empname, deptno, salary

I can't alter the Employee table to add salary because of the
historical files.  I used an external table because I wanted the table to
pick up every new log file automatically as soon as it is
generated.
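
For reference, the alter I have in mind would be something along these
lines. This is only a sketch: the BIGINT type for salary is an assumption
for illustration, since nothing in the files dictates it.

```sql
-- Hypothetical: append the new column to the end of the existing schema.
-- The BIGINT type for salary is assumed for illustration only.
ALTER TABLE Employee ADD COLUMNS (salary BIGINT);
```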

I know the long way is to add the new field to all the old files,
but I would prefer a more scalable way to do this.  Does anyone know of one?
Thanks
