hive-dev mailing list archives

From Ning Zhang
Subject Re: Small file problem and GenMRFileSink1
Date Thu, 30 Jun 2011 22:46:41 GMT
If you are using Hive trunk and your table is stored in RCFile format, you can run:

alter table src_rc_merge_test concatenate;
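
For tables that are not stored in RCFile format, a related approach is to rewrite the data with Hive's output-merge settings turned on, so the merge step runs at the end of the query. This is only a rough sketch: the table names small_files_src and merged_dest are hypothetical, the size values are illustrative rather than recommended, and the behavior of these settings should be checked against the Hive version in use.

```sql
-- Ask Hive to merge small output files at the end of map-only and map-reduce jobs.
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;

-- Illustrative thresholds (in bytes): target size of merged files, and the
-- average output file size below which the extra merge job is triggered.
SET hive.merge.size.per.task=256000000;
SET hive.merge.smallfiles.avgsize=16000000;

-- Hypothetical tables: rewrite the small files into a destination table so the
-- merge step can combine them into fewer, larger files.
INSERT OVERWRITE TABLE merged_dest
SELECT * FROM small_files_src;
```

This goes through a normal query rather than calling GenMRFileSink1 directly, so it does not depend on Hive's internal optimizer classes.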

On Jun 30, 2011, at 9:53 AM, David Ginzburg wrote:

> Hi,
> I'm not sure whether this belongs on hive-dev or hive-user.
> I have a folder with many small files.
> I would like to reduce the number of files the way Hive merges its output.
> I tried to understand from the source of org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1
> how to leverage the API to submit a job that merges output files.
> I think I was able to identify:  
> private void createMergeJob(FileSinkOperator fsOp, GenMRProcContext ctx, String finalName)
> throws SemanticException 
> as the entry point to the logic that performs the operation, but I did not find documentation
> on how to use it.
> Is there an example that demonstrates the use of this API call?
