hive-dev mailing list archives

From Sukhendu Chakraborty <sukhendu.chakrabo...@gmail.com>
Subject appending data to clustered tables.
Date Wed, 19 Feb 2014 23:45:29 GMT
Hi,

Is there a way to add data to a bucketed (clustered) table in hive-0.11? I
have a clustered table with 32 buckets (no partitions) that already holds
some data. Can I append more data by running an "insert into <table>...."?
From http://osdir.com/ml/hive-user-hadoop-apache/2009-03/msg00094.html it
looks like the feature was not supported as of 2009.
When I experimented with it in hive-0.11, I saw that after the second
insert a new set of 32 files was created with the '000000_*.copy' notation,
so we had 64 files instead of the original 32. Is this expected behavior,
and does Hive know how to merge the 64 files into 32 (one per bucket)
before processing? How about sorted bucketed tables?
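
To make the file layout concrete, here is a minimal Python sketch that groups the files of the table directory by bucket. It is only an illustration: the file-name shape (`000000_0` for the first insert and `000000_0_copy_1` for the second) is an assumption about how the copies are named, and `simulated_listing` and `files_per_bucket` are hypothetical helpers, not part of Hive.

```python
import re
from collections import defaultdict

# Assumed file-name shape: <6-digit bucket id>_<attempt>, optionally
# followed by _copy_<n> for files written by a later insert.
BUCKET_FILE = re.compile(r"^(\d{6})_\d+(?:_copy_\d+)?$")


def files_per_bucket(names):
    """Group file names by their leading bucket id; skip names that do not match."""
    groups = defaultdict(list)
    for name in names:
        match = BUCKET_FILE.match(name)
        if match:
            groups[int(match.group(1))].append(name)
    return dict(groups)


def simulated_listing(num_buckets=32):
    """Hypothetical directory listing after two inserts into the same table."""
    first = [f"{b:06d}_0" for b in range(num_buckets)]
    second = [f"{b:06d}_0_copy_1" for b in range(num_buckets)]
    return first + second


listing = simulated_listing()
groups = files_per_bucket(listing)
print(len(listing), "files in", len(groups), "buckets")
```

Under that naming assumption, every bucket id ends up with two files, which is the 64-versus-32 situation described above. Whether Hive treats the two files of a bucket as a single bucket at query time is exactly the open question.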

Thanks,
-Sukhendu
