hive-user mailing list archives

From Mathieu Despriee <>
Subject Re: Real-life experience of forcing smaller input splits?
Date Fri, 25 Jan 2013 07:44:26 GMT
Hi David,

What file format and compression type are you using?


On 25 Jan 2013, at 07:16, David Morel <> wrote:

> Hello,
> I have seen many posts on various sites and mailing lists, but didn't
> find a firm answer anywhere: is it possible, yes or no, to force a split
> size smaller than a block on the mappers, from the client side? I'm not
> after pointers to the docs (unless you're very, very sure :-) but after
> real-life experience along the lines of 'yes, it works this way, I've
> done it like this...'
> All the parameters that I could find (especially specifying a max input
> split size) seem to have no effect, and the files that I have are so
> heavily compressed that they completely saturate the mappers' memory
> when processed.
> A solution I could imagine for this specific issue is reducing the block
> size, but for now I simply went with disabling in-file compression for
> those files. Changing the block size on a per-file basis is something
> I'd like to avoid if at all possible.
> All the Hive settings that we tried only got me as far as raising the
> number of mappers from 5 to 6 (yay!) where I would have needed at least
> ten times more.
> Thanks!
> D.Morel
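
For context on why the file format and compression type matter here, below is a minimal sketch of how a maximum split size interacts with the block size. It assumes the formula used by Hadoop's new-API FileInputFormat.computeSplitSize, max(minSize, min(maxSize, blockSize)); Hive's own input formats (for example CombineHiveInputFormat) may behave differently. The function names and the `splittable` flag are hypothetical, chosen for illustration only, and the split count is a rough estimate that ignores Hadoop's slop factor.

```python
MB = 1024 * 1024


def compute_split_size(block_size, min_size, max_size):
    """Assumed formula: max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))


def estimate_num_splits(file_size, split_size, splittable=True):
    """Rough split count for one file (hypothetical helper).

    A file compressed with a non-splittable codec such as gzip is read
    by a single mapper regardless of the configured split size.
    """
    if not splittable:
        return 1
    # Ceiling division; real Hadoop also applies a slop factor, ignored here.
    return -(-file_size // split_size)


# A 640 MB file on 128 MB blocks, with the max split size lowered to 16 MB.
split = compute_split_size(block_size=128 * MB, min_size=1, max_size=16 * MB)
print(split // MB)                                              # split size in MB
print(estimate_num_splits(640 * MB, split))                     # splittable file
print(estimate_num_splits(640 * MB, split, splittable=False))   # e.g. a gzip file
```

Under these assumptions, lowering the maximum split size only increases the mapper count when the underlying file can actually be split, which would be consistent with the settings appearing to have no effect on heavily compressed files.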
