hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mich Talebzadeh" <m...@peridale.co.uk>
Subject RE: Adding a new column to a table and updating it
Date Fri, 11 Dec 2015 23:07:04 GMT
Thanks Eugene

 

So basically as I understand when a  column can be added to an already
table. 

 

1.    The metadata for the underlying table will be updated

2.    The new column will by default have null value

3.    The existing rows cannot have new column updated to a non null value

4.    New rows can have non null values set for the new column

5.    No sql operation can be done on that column. For example select * from
<TABLE> where new_column IS NOT NULL

6.    The easiest option is to create a new table with the new column and do
insert/select from the existing table with values set for the new column

 

 

I hope this makes sense

 

 

 

Mich Talebzadeh

 

Sybase ASE 15 Gold Medal Award 2008

A Winning Strategy: Running the most Critical Financial Data on ASE 15

 
<http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908
.pdf>
http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.
pdf

Author of the books "A Practitioner's Guide to Upgrading to Sybase ASE 15",
ISBN 978-0-9563693-0-7. 

co-author "Sybase Transact SQL Guidelines Best Practices", ISBN
978-0-9759693-0-4

Publications due shortly:

Complex Event Processing in Heterogeneous Environments, ISBN:
978-0-9563693-3-8

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume
one out shortly

 

 <http://talebzadehmich.wordpress.com/> http://talebzadehmich.wordpress.com

 

NOTE: The information in this email is proprietary and confidential. This
message is for the designated recipient only, if you are not the intended
recipient, you should destroy it immediately. Any information in this
message shall not be understood as given or endorsed by Peridale Technology
Ltd, its subsidiaries or their employees, unless expressly so stated. It is
the responsibility of the recipient to ensure that this email is virus free,
therefore neither Peridale Ltd, its subsidiaries nor their employees accept
any responsibility.

 

From: Eugene Koifman [mailto:ekoifman@hortonworks.com] 
Sent: 11 December 2015 22:52
To: user@hive.apache.org
Subject: Re: Adding a new column to a table and updating it

 

Schema evolution is not supported for Acid tables.

https://issues.apache.org/jira/browse/HIVE-11981 fixes it but hasn't been
released yet.

There is no quick way to recover data.  

You could write a script to use ORC FileDump utility to look at actual files
in the table 

to group them into sets where each set has the same schema.   Then you could
create new Acid tables with schemas matching each set and copy 

the files to new tables.   You'd need to make sure to preserve
"delta_xxx_yyy/bucket_zzz" part of the path.

Then you'd be able to read each new table.  You'd need to do this in the
same Hive warehouse so that the metastore still contains

all the metadata about transactions.

 

Eugene 

 

From: Mich Talebzadeh <mich@peridale.co.uk <mailto:mich@peridale.co.uk> >
Reply-To: "user@hive.apache.org <mailto:user@hive.apache.org> "
<user@hive.apache.org <mailto:user@hive.apache.org> >
Date: Friday, December 11, 2015 at 12:56 AM
To: "user@hive.apache.org <mailto:user@hive.apache.org> "
<user@hive.apache.org <mailto:user@hive.apache.org> >
Subject: Adding a new column to a table and updating it

 

I can use Hive to add a new column to an existing table (it is an ORC
transactional table) with data in it.

 

Example:

 

hive> alter table t add columns (new_col varchar(30));

OK

 

Now that column is created with NULL value. The operation is fast as the DDL
in metadata is altered.

 

However

 

1.        I cannot update that new new_col from NULL to something else. 

hive> update t set new_col = 'New value';

2.       You will get java.lang.ArrayIndexOutOfBoundsException

3.       Although I can do insert/select new rows to that table with value
for the new column, I cannot do any operation on it. For example

hive> select count(1) from t where new_col is not null;

4.       This causes the same error

java.lang.ArrayIndexOutOfBoundsException: 27

 

Is there a quick fix to this short of rebuilding data?

 

Thanks

 

 

Mich Talebzadeh

 

Sybase ASE 15 Gold Medal Award 2008

A Winning Strategy: Running the most Critical Financial Data on ASE 15

 
<http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908
.pdf>
http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.
pdf

Author of the books "A Practitioner's Guide to Upgrading to Sybase ASE 15",
ISBN 978-0-9563693-0-7. 

co-author "Sybase Transact SQL Guidelines Best Practices", ISBN
978-0-9759693-0-4

Publications due shortly:

Complex Event Processing in Heterogeneous Environments, ISBN:
978-0-9563693-3-8

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume
one out shortly

 

 <http://talebzadehmich.wordpress.com/> http://talebzadehmich.wordpress.com

 

NOTE: The information in this email is proprietary and confidential. This
message is for the designated recipient only, if you are not the intended
recipient, you should destroy it immediately. Any information in this
message shall not be understood as given or endorsed by Peridale Technology
Ltd, its subsidiaries or their employees, unless expressly so stated. It is
the responsibility of the recipient to ensure that this email is virus free,
therefore neither Peridale Ltd, its subsidiaries nor their employees accept
any responsibility.

 


Mime
View raw message