hive-user mailing list archives

From Matt Tucker <matthewt...@gmail.com>
Subject Re: split into less files
Date Wed, 09 Nov 2011 03:05:36 GMT
It sounds like you want to look at setting hive.merge.mapredfiles to true in your hive-site.xml.

Just be aware that it will likely add another map step to your jobs to consolidate the files.
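
For reference, here is a minimal sketch of what that could look like in hive-site.xml. Only hive.merge.mapredfiles comes from the suggestion above. The other three properties are standard Hive merge settings added for illustration, and the byte values are assumptions to tune for your cluster, not recommendations.

```xml
<!-- Fragment for hive-site.xml: place these inside the existing <configuration> element. -->

<!-- Merge small output files at the end of a map-reduce job. -->
<property>
  <name>hive.merge.mapredfiles</name>
  <value>true</value>
</property>

<!-- Merge small output files at the end of a map-only job. -->
<property>
  <name>hive.merge.mapfiles</name>
  <value>true</value>
</property>

<!-- Target size in bytes of each merged file (illustrative value). -->
<property>
  <name>hive.merge.size.per.task</name>
  <value>256000000</value>
</property>

<!-- Merge only when the average output file size is below this many bytes (illustrative value). -->
<property>
  <name>hive.merge.smallfiles.avgsize</name>
  <value>100000000</value>
</property>
```

You can also try it for a single session first by running SET hive.merge.mapredfiles=true; before the overwrite. As far as I recall, the merge only runs when the average output file size is below hive.merge.smallfiles.avgsize (about 16 MB by default), so 20 MB outputs may not be merged unless that threshold is raised.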

Matt Tucker



On Nov 8, 2011, at 6:19 PM, Shouguo Li <the1plummie@gmail.com> wrote:

> I think that has to do with your configured block size. Check your value for dfs.block.size in /hdfs-site.xml
> But just curious, why would the number of files matter for your use case?
> 
> 
> On Fri, Oct 21, 2011 at 1:18 AM, Vikas Srivastava <vikas.srivastava@one97.net> wrote:
> Hey All,
> 
> 
> I have a table with a single partition, and that partition holds around 100 files of 200 MB each. When I overwrite this into another table, it produces 100 files of about 20 MB each (compressed). What I want is for it to produce only 1, 2, or 10 files of 100 MB or 200 MB each.
> 
> 
> In other words, after the overwrite there should be fewer files than in the uncompressed table.
> 
> 
> 
> 
> -- 
> With Regards
> Vikas Srivastava
> 
> DWH & Analytics Team
> Mob:+91 9560885900
> One97 | Let's get talking !
> 
> 
