hive-user mailing list archives

From Zheng Shao <zsh...@gmail.com>
Subject Re: Migration Strategy
Date Mon, 26 Jan 2009 23:45:36 GMT
Hi Josh,

DynamicSerDe with TCTLSeparatedProtocol will also treat columns missing from
the data as NULL.

Basically, if you create the table without specifying the SerDe or Protocol,
then it should be OK to add a new column to the schema; for old data, that
new column will be NULL.
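
To illustrate the behaviour described above, here is a rough Python sketch of
how a reader could pad a row that has fewer fields than the current schema.
This is not Hive's SerDe code. The function name read_row, the column names,
and the sample rows are all made up for the example, and the Ctrl-A delimiter
is assumed to be the table's field separator.

```python
from typing import Dict, List, Optional

# Assumption: fields are separated by Ctrl-A (\x01).
FIELD_DELIM = "\x01"


def read_row(line: str, columns: List[str]) -> Dict[str, Optional[str]]:
    """Map a delimited line onto the schema, padding missing trailing
    columns with None (standing in for NULL)."""
    fields: List[Optional[str]] = list(line.rstrip("\n").split(FIELD_DELIM))
    # Old rows have fewer fields than the new schema; fill the gap with None.
    missing = len(columns) - len(fields)
    padded = fields[: len(columns)] + [None] * missing
    return dict(zip(columns, padded))


# Hypothetical schema before and after adding a column at the end.
old_schema = ["user_id", "url"]
new_schema = old_schema + ["referrer"]

old_line = "42\x01/index.html\n"             # written before the change
new_line = "43\x01/about.html\x01google\n"   # written after the change

old_row = read_row(old_line, new_schema)
new_row = read_row(new_line, new_schema)
```

Here old_row ends up with referrer set to None while new_row carries the
value from the file, so the old files can stay where they are.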


Zheng

On Mon, Jan 26, 2009 at 3:31 PM, Ashish Thusoo <athusoo@facebook.com> wrote:

>  If you are adding a column at the end of the table, the old data can stay
> as it is, provided the table was created with MetadataTypedColumnSetSerDe
> (I am not sure what happens with DynamicSerDe). MetadataTypedColumnSetSerDe
> interprets missing columns at the end as NULLs in the old data. Note that
> this only works when adding columns at the end without changing names...
>
> Ashish
>
>  ------------------------------
> *From:* Josh Ferguson [mailto:josh@besquared.net]
> *Sent:* Monday, January 26, 2009 3:06 PM
> *To:* hive-user@hadoop.apache.org
> *Subject:* Migration Strategy
>
> What's the current strategy for when you have a production system and you
> realize you need to add another column to the table or do some other thing?
> Seems like you'd have to make a new table, run a script to transform and
> load all your old data to the new table, and then remove the old table. Is
> this what is currently being done?
> Josh F.
>



-- 
Yours,
Zheng
