hive-user mailing list archives

From Sammy Yu <...@brightedge.com>
Subject Merging small files with dynamic partitions
Date Fri, 15 Oct 2010 20:43:08 GMT
Hi,
  I have a dynamic partition query that generates quite a few small
files, which I would like to merge:

SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.dynamic.partition=true;
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
SET hive.merge.size.per.task=256000000;
SET hive.merge.smallfiles.avgsize=16000000000;
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
SET hive.mergejob.maponly=true;
INSERT OVERWRITE TABLE daily_conversions_without_rank_all_table
PARTITION(org_id, day)
SELECT session_id, permanent_id, first_date, last_date, week, month, quarter,
referral_type, search_engine, us_search_engine,
keyword, unnormalized_keyword, branded, conversion_meet, goals_meet,
pages_viewed,
entry_page, page_types,
org_id, day
FROM daily_conversions_without_rank_table;

I am running the latest version from trunk with HIVE-1622, but I
can't get the post-merge step to run. I have already raised
hive.merge.smallfiles.avgsize. I'm wondering whether the filtering
at runtime is causing the merge process to be skipped. The hive
output and log files are attached.
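
As a sanity check on the threshold: as documented, a merge job is
started when the average output file size of a job is below
hive.merge.smallfiles.avgsize. Below is a minimal Python sketch of that
condition only. The helper name should_merge and the file sizes are
hypothetical, and this is not Hive's actual code; in particular it does
not show how the check is applied per dynamic partition.

```python
def should_merge(file_sizes, avg_size_threshold):
    """Return True when the average file size is below the threshold.

    file_sizes: output file sizes in bytes (hypothetical values).
    avg_size_threshold: stands in for hive.merge.smallfiles.avgsize.
    """
    if not file_sizes:
        # No output files, so there is nothing to merge.
        return False
    average = sum(file_sizes) / len(file_sizes)
    return average < avg_size_threshold


# Hypothetical partition holding three small files (sizes in bytes).
sizes = [4_000_000, 6_000_000, 5_000_000]
print(should_merge(sizes, 16_000_000_000))
```

With the value set above (16000000000 bytes), any realistic set of
small files satisfies this condition, so the threshold alone should
not be what prevents the merge.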


Thanks,
Sammy
