hive-user mailing list archives

From Rajat Khandelwal
Subject Schema evolution in hive tables
Date Wed, 07 Dec 2016 10:32:37 GMT
So far, my understanding has been that in Hive tables each partition has
its own schema, and that whenever you add a partition to a Hive table, the
current table schema is copied into the partition schema. This should allow
seamless evolution of the schema. Recently I came across something that
contradicts this, so I am looking for some clarification.

We have a table stored in ORC format. Say its schema is (a,b,c,d). We added
one partition, and the data in it is stored in that same column order. When
we query (a, b) from this partition, the two columns come back in the
correct order. We then changed the schema of the *table* to (b,c,d,a). The
schema of the partition is still (a,b,c,d), as verified by running
DESCRIBE EXTENDED on the partition. Now we issue the same query on the old
partition, projecting (a, b). Surprisingly, it projects (b, c). Is this the
expected behaviour, or am I missing something obvious?
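
For concreteness, the sequence above can be sketched in HiveQL roughly as
follows. This is only an illustration: the table name t, the partition
column dt, the STRING column types, and the use of CHANGE COLUMN ... AFTER
to reorder are all assumptions, since the exact statements are not given in
this message.

```sql
-- Hypothetical table; names and types are placeholders.
CREATE TABLE t (a STRING, b STRING, c STRING, d STRING)
PARTITIONED BY (dt STRING)
STORED AS ORC;

-- Add one partition; data is then written to it in (a,b,c,d) order.
ALTER TABLE t ADD PARTITION (dt='2016-12-01');

-- Before the schema change: returns a and b as expected.
SELECT a, b FROM t WHERE dt='2016-12-01';

-- One possible way to turn the table schema into (b,c,d,a):
-- move column a after column d. Without CASCADE, this changes the
-- table-level metadata only.
ALTER TABLE t CHANGE COLUMN a a STRING AFTER d;

-- The partition still reports (a,b,c,d).
DESCRIBE EXTENDED t PARTITION (dt='2016-12-01');

-- The same query on the old partition, which now gives the
-- surprising result described above.
SELECT a, b FROM t WHERE dt='2016-12-01';
```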

Coming back to the question of schema evolution: as business use cases
grow, there is a need to add fields to the table. So am I restricted by
Hive to adding my fields at the end only?
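
For reference, the append-at-the-end case would look roughly like the
sketch below. ADD COLUMNS places new columns after the existing columns and
before the partition columns. The table name t and the column e are
hypothetical.

```sql
-- Appends e after the existing columns of the hypothetical table t.
ALTER TABLE t ADD COLUMNS (e STRING COMMENT 'hypothetical new field');
```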


Rajat Khandelwal
Software Engineer
