hive-user mailing list archives

From Keith Wiley
Subject bucketed table problems
Date Fri, 07 Mar 2014 19:46:47 GMT
I want to convert a table to a bucketed table, so I made a new table with the same schema as
the old table and specified a cluster column:

create table foo_bucketed (
  a string,
  b int,
  c float
)
clustered by (b) into 10 buckets;

Then I populate it from my original table:

set hive.enforce.bucketing = true;
insert overwrite table foo_bucketed
select * from foo;

All of the data goes into the first bucket, leaving the remaining 9 buckets empty (in the
file system, the remaining 9 files are zero bytes). Furthermore, the cluster column is now
NULL in every row: its values have been completely erased by the insertion (which might, of
course, explain how all the rows ended up in a single bucket).
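
To illustrate why an all-NULL cluster column would put every row in one bucket, here is a
minimal Python sketch. It assumes Hive picks the bucket as (hash & Integer.MAX_VALUE) %
numBuckets, that an int column hashes to its own value, and that NULL hashes to 0. Those
rules are assumptions about Hive's behavior, and the function names are hypothetical, not
Hive APIs.

```python
# Sketch of Hive-style bucket assignment. The hashing rules below are
# assumptions for illustration, not code taken from Hive itself.
INT_MAX = 0x7FFFFFFF  # Java's Integer.MAX_VALUE


def hive_int_hash(value):
    """Assumed hash of an int column: the value itself; NULL (None) hashes to 0."""
    return 0 if value is None else value


def bucket_for(value, num_buckets=10):
    """Assumed rule: (hash & Integer.MAX_VALUE) % num_buckets."""
    return (hive_int_hash(value) & INT_MAX) % num_buckets


def bucket_counts(values, num_buckets=10):
    """Count how many rows land in each bucket for a list of column values."""
    counts = [0] * num_buckets
    for v in values:
        counts[bucket_for(v, num_buckets)] += 1
    return counts


# Distinct int values spread across all 10 buckets.
print(bucket_counts(range(100)))
# If every value of b arrives as NULL, every row lands in bucket 0.
print(bucket_counts([None] * 100))
```

Under these assumptions, the 9 empty files would be a side effect of the NULLs rather than
a separate bucketing failure, so the question to answer is why b becomes NULL during the
insert.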

Keith Wiley

"Yet mark his perfect self-contentment, and hence learn his lesson, that to be
self-contented is to be vile and ignorant, and that to aspire is better than to
be blindly and impotently happy."
                                           --  Edwin A. Abbott, Flatland
