crunch-dev mailing list archives

From Ben Juhn <benjij...@gmail.com>
Subject Re: Processing splittable inputs
Date Fri, 26 Feb 2016 23:17:31 GMT
The data isn’t compressed.  The parameters aren’t showing up in the job configuration either.


> On Feb 25, 2016, at 5:15 PM, Ben Juhn <benjijuhn@gmail.com> wrote:
> 
> Hello there,
> 
> I haven’t been able to get crunch to split inputs into multiple mappers.  Currently
> it’s giving me one mapper per text file, even though they’re 1GB each.  I’ve tried supplying
> split.maxsize on the command line and in the DoFn implementation:
> 
> @Override
> public void configure(Configuration conf) {
>   conf.set("crunch.combine.file.size", "67108864");
>   conf.set("mapreduce.input.fileinputformat.split.maxsize", "67108864");
>   conf.set("mapreduce.input.fileinputformat.split.minsize", "67108864");
> }
> 
> Any suggestions?
> 
> Thanks,
> Ben
> 
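For reference, here is a rough sketch of the split count the quoted settings are aiming for. It assumes the usual Hadoop FileInputFormat rule, splitSize = max(minSize, min(maxSize, blockSize)), and ignores the small slop factor Hadoop applies to the last split. The class name SplitEstimate, the 128 MB block size, and reading "1GB" as 2^30 bytes are assumptions for illustration only; whether Crunch's own input handling follows this rule is exactly what is in question in this thread.

```java
// Sketch only: estimates how many splits the quoted settings would aim for.
// Nothing here calls Hadoop or Crunch; the formula is reproduced by hand.
final class SplitEstimate {

    // Split-size rule as used by Hadoop's FileInputFormat (assumed to apply here).
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Ceiling division: number of splits needed to cover the whole file.
    static long estimateSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long fileSize = 1L << 30;    // one input file, "1GB" taken as 2^30 bytes (assumption)
        long blockSize = 128L << 20; // HDFS block size, assumed 128 MB
        long minSize = 67108864L;    // split.minsize from the quoted message
        long maxSize = 67108864L;    // split.maxsize from the quoted message

        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        System.out.println("split size: " + splitSize);
        System.out.println("splits per file: " + estimateSplits(fileSize, splitSize));
    }
}
```

If the settings were taking effect, each file would be expected to produce on the order of 16 map tasks rather than one.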

