hive-user mailing list archives

From Daniel Haviv <daniel.ha...@veracity-group.com>
Subject Merging small files
Date Fri, 16 Oct 2015 07:48:44 GMT
Hi,
We are using Hive to merge small files by setting
hive.merge.smallfiles.avgsize to 120000000 and running an INSERT ... SELECT
into a table.
The problem is that this takes two passes over the data: the first to insert
the data and the second to merge it.
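
For reference, a minimal sketch of the kind of job described above. The table
names are placeholders, and enabling hive.merge.mapredfiles is an assumption
about the setup; only hive.merge.smallfiles.avgsize is the setting actually
named in this message.

```sql
-- Assumption: merging of map-reduce job output is switched on.
SET hive.merge.mapredfiles=true;
-- From the message: trigger a merge when the average output file size
-- falls below roughly 120 MB.
SET hive.merge.smallfiles.avgsize=120000000;

-- Hypothetical table names, for illustration only.
INSERT OVERWRITE TABLE merged_table
SELECT * FROM small_files_table;
```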

Is there a more efficient way to have Hive merge the small files without
making two passes over the data?


Thank you.
Daniel
