hive-user mailing list archives

From Pradeep Gollakota <pradeep...@gmail.com>
Subject Re: HCatInputFormat combine splits
Date Thu, 14 May 2015 22:03:35 GMT
Still no effect. I set minsize to 32M and maxsize to 64M.

On Thu, May 14, 2015 at 11:07 AM, Ankit Bhatnagar <ankitb@yahoo-inc.com>
wrote:

> Try these:
> mapred.max.split.size=
> mapred.min.split.size=
>
> mapreduce.input.fileinputformat.split.maxsize=
> mapreduce.input.fileinputformat.split.minsize=
>
>
>
>
>
>   On Thursday, May 14, 2015 11:04 AM, Pradeep Gollakota <
> pradeepg26@gmail.com> wrote:
>
>
> The following property has had no effect.
>
> mapreduce.input.fileinputformat.split.maxsize = 67108864
>
> I'm still getting 1 Mapper per file.
>
> On Thu, May 14, 2015 at 10:27 AM, Ankit Bhatnagar <ankitb@yahoo-inc.com>
> wrote:
>
> You can explicitly set the split size.
>
>
>
>   On Wednesday, May 13, 2015 11:37 PM, Pradeep Gollakota <
> pradeepg26@gmail.com> wrote:
>
>
> Hi All,
>
> I'm writing an MR job to read data using HCatInputFormat... however, the
> job is generating too many splits. I don't have this problem when running
> queries in Hive since it combines splits by default.
>
> Is there an equivalent in MR so that I'm not generating thousands of
> mappers?
>
> Thanks,
> Pradeep
>
>
>
>
>
>
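For readers following the thread: a minimal sketch of how the min/max split size properties feed into the split size, assuming the underlying storage format behaves like Hadoop's FileInputFormat (whether HCatInputFormat honors these settings depends on the input format it wraps). The formula in computeSplitSize is the one FileInputFormat uses; the class name, the splitsForFile helper, the 128 MB block size, and the 10 MB file size are illustrative assumptions, not values from the thread. Because splits are computed per file and never span files, the settings can cut a large file into more pieces but cannot merge small files, which would be consistent with the "1 Mapper per file" result reported above.

```java
public class SplitSizeSketch {

    // Same formula as Hadoop's FileInputFormat.computeSplitSize:
    // clamp the block size between the configured min and max split sizes.
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Hypothetical helper: rough number of splits for ONE file.
    // Simplification: ignores the small slop Hadoop allows on the last split
    // and counts every file as at least one split.
    public static long splitsForFile(long fileLength, long splitSize) {
        return Math.max(1L, (fileLength + splitSize - 1) / splitSize);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        long blockSize = 128 * mb; // assumed HDFS block size
        long minSize = 32 * mb;    // minsize from the thread
        long maxSize = 64 * mb;    // maxsize from the thread (67108864)

        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        System.out.println("split size (MB): " + splitSize / mb);

        // Assumed workload: 1000 small files of 10 MB each.
        long smallFile = 10 * mb;
        long totalSplits = 1000 * splitsForFile(smallFile, splitSize);
        System.out.println("splits for 1000 small files: " + totalSplits);
    }
}
```

Under these assumptions every file smaller than the split size still yields its own split, so reducing the mapper count would need an input format that packs several files into one split (as Hive's default combining input format does), not a different split size.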
