hive-user mailing list archives

From Daniel Haviv
Subject Merging small files
Date Fri, 16 Oct 2015 07:48:44 GMT
We are using Hive to merge small files by setting
hive.merge.smallfiles.avgsize to 120000000 and running an INSERT ... SELECT
into a table.
The problem is that this takes two passes over the data: the first to insert
the data and the second to merge it.
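
A minimal sketch of the setup described above, for reference. The table names (source_table, merged_table) are hypothetical placeholders, and it assumes the hive.merge.mapfiles and hive.merge.mapredfiles switches are enabled, since the average-size threshold only takes effect when one of them is on:

```sql
-- Assumption: merging is enabled for both map-only and map-reduce jobs.
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;

-- If the average output file size is below this threshold (in bytes),
-- Hive launches an additional job to merge the output files.
SET hive.merge.smallfiles.avgsize=120000000;

-- Hypothetical table names. The insert is the first pass over the data;
-- the merge job triggered by the settings above is the second.
INSERT OVERWRITE TABLE merged_table
SELECT * FROM source_table;
```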

Is there a more efficient way to have Hive merge small files without making
two passes over the data?

Thank you.
