crunch-user mailing list archives

From: David Ortiz <dpo5...@gmail.com>
Subject: Re: Hadoop Configuration from DoFn
Date: Tue, 13 Oct 2015 20:01:06 GMT
Micah,

      I have definitely had that approach work on CDH 5.4.4 for passing in
memory settings that were needed for a specific DoFn but would be
inappropriate for the rest of the Pipeline.  Not sure if something changed
after that version that would prevent this from working.
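
(For reference, a minimal sketch of the Pipeline-level approach Micah
describes below; the Driver class name and the pipeline body are
placeholders, not code from this thread:)

import org.apache.crunch.Pipeline;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.hadoop.conf.Configuration;

public class Driver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Set the memory settings up front, before the pipeline is planned,
    // so they apply to every MapReduce job the pipeline launches.
    conf.set("mapreduce.map.java.opts", "-Xmx3900m");
    conf.set("mapreduce.reduce.java.opts", "-Xmx3900m");
    conf.set("mapreduce.map.memory.mb", "4096");
    conf.set("mapreduce.reduce.memory.mb", "4096");

    Pipeline pipeline = new MRPipeline(Driver.class, conf);
    // ... build and run the pipeline here ...
    pipeline.done();
  }
}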

Thanks,
     Dave

On Tue, Oct 13, 2015 at 3:55 PM Micah Whitacre <mkwhitacre@gmail.com> wrote:

> Luke,
>   Generally that configuration should be set on the Configuration object
> passed to Pipeline vs on the individual DoFns.  The configure(...) method
> is called when re-instantiating the DoFn on the Map/Reduce task and at that
> point those memory settings wouldn't be honored.
>
> On Tue, Oct 13, 2015 at 2:52 PM, Luke Hansen <luke@wealthfront.com> wrote:
>
>> Does anyone know if this is the right way to configure Hadoop from a
>> Crunch DoFn?  This didn't seem to affect anything.
>>
>> Thanks!
>>
>> @Override
>> public void configure(Configuration conf) {
>>   conf.set("mapreduce.map.java.opts", "-Xmx3900m");
>>   conf.set("mapreduce.reduce.java.opts", "-Xmx3900m");
>>
>>   conf.set("mapreduce.map.memory.mb", "4096");
>>   conf.set("mapreduce.reduce.memory.mb", "4096");
>> }
>>
>>
>
