hive-user mailing list archives

From vic0777 <vic0...@163.com>
Subject Re:Re: Updating not reusing the blocks in previous and former version
Date Tue, 02 Dec 2014 09:06:13 GMT


This document describes how transactions work and what the data layout is: https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions.
See the "Basic Design" section. HDFS files are immutable, so Hive writes every transaction to a new delta directory and merges the deltas with the base data at read time; it never writes back to the same blocks.
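The merge-on-read idea can be sketched roughly like this (a toy model in Python, not Hive's actual reader; the row ids and values are made up for illustration):

```python
# Toy model of Hive ACID merge-on-read: the base data and each delta
# directory are immutable once written; a reader merges them, letting
# rows from higher transaction ids shadow earlier versions.

def merge_on_read(base, deltas):
    """base: dict of row_id -> value (the original bucket contents).
    deltas: list of (txn_id, {row_id: new_value}) pairs.
    Returns the merged view a reader would see."""
    view = dict(base)              # the base file itself is never rewritten
    for txn_id, changes in sorted(deltas):
        view.update(changes)       # later transactions shadow earlier rows
    return view

# Hypothetical contents mirroring the thread below:
base = {1: "alice", 2: "bob", 3: "carol"}   # e.g. delta_0000012_0000012
deltas = [(14, {2: "bobby"})]               # e.g. delta_0000014_0000014 from an UPDATE
print(merge_on_read(base, deltas))          # {1: 'alice', 2: 'bobby', 3: 'carol'}
```

The UPDATE never touches the blocks holding the base data; it only adds a new delta, which is why new block IDs appear after each update.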

Wantao






At 2014-12-02 16:58:26, "unmesha sreeveni" <unmeshabiju@gmail.com> wrote:

Why is Hive's "UPDATE" not reusing the blocks?
The update is not written to the same block — why is that?




On Tue, Dec 2, 2014 at 10:50 AM, unmesha sreeveni <unmeshabiju@gmail.com> wrote:

I tried updating a record in an earlier Hive version, and also tried UPDATE in Hive 0.14.0, the newer version that supports it.


I created a table with 3 buckets, about 180 MB in total. In my warehouse the data is stored in 3 different blocks:
 
delta_0000012_0000012 
--- Block ID: 1073751752
--- Block ID: 1073751750
--- Block ID: 1073751753


After doing an update, I am getting 2 directories:


delta_0000012_0000012 
--- Block ID: 1073751752
--- Block ID: 1073751750
--- Block ID: 1073751753
AND
delta_0000014_0000014
               ---Block ID: 1073752044


I.e., the blocks are not reused.
Is my understanding correct?
Any pointers?
                             


--

Thanks & Regards


Unmesha Sreeveni U.B

Hadoop, Bigdata Developer
Centre for Cyber Security | Amrita Vishwa Vidyapeetham

http://www.unmeshasreeveni.blogspot.in/













