crunch-user mailing list archives

From Micah Whitacre <mkwhita...@gmail.com>
Subject Re: Hadoop Configuration from DoFn
Date Tue, 13 Oct 2015 20:08:07 GMT
Yeah, I was confusing it with the setContext(...) method, which provides
the configuration when the job is actually running.[1]  Luke, you might
look at generating a plan of your pipeline to see what other DoFns might be
inside the same job and conflicting with your settings.

We typically use global settings rather than tweaking each DoFn, simply
because it lets us avoid worrying about which DoFns get grouped into
a single task and override each other.
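
To make the override problem concrete, here is a minimal sketch in plain Java.
It does not use Crunch or Hadoop at all: a Map stands in for the job's
Configuration, each lambda stands in for one DoFn's configure(...) method, and
the class and method names (SharedConfDemo, applyAll) are hypothetical. It also
assumes the hooks are applied one after another to a single shared
configuration, which is a simplification of whatever the planner actually does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class SharedConfDemo {

    // Applies every hook, in order, to a copy of the global settings.
    // Assumption: this models DoFns that end up in the same job and
    // therefore all write to that job's single configuration.
    public static Map<String, String> applyAll(
            Map<String, String> global,
            List<Consumer<Map<String, String>>> hooks) {
        Map<String, String> jobConf = new LinkedHashMap<>(global);
        for (Consumer<Map<String, String>> hook : hooks) {
            hook.accept(jobConf);
        }
        return jobConf;
    }

    public static void main(String[] args) {
        // Global (pipeline-level) setting.
        Map<String, String> global = new LinkedHashMap<>();
        global.put("mapreduce.map.memory.mb", "1024");

        // Two stand-in DoFns that both set the same key.
        List<Consumer<Map<String, String>>> hooks = new ArrayList<>();
        hooks.add(conf -> conf.put("mapreduce.map.memory.mb", "4096"));
        hooks.add(conf -> conf.put("mapreduce.map.memory.mb", "2048"));

        // The last writer wins, so the first DoFn's value is lost.
        System.out.println(applyAll(global, hooks).get("mapreduce.map.memory.mb"));
    }
}
```

With both stand-in DoFns in the same job, only the value written last
survives, which is why setting these keys once at the pipeline level is easier
to reason about.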

[1] -
http://crunch.apache.org/apidocs/0.12.0/org/apache/crunch/DoFn.html#configure(org.apache.hadoop.conf.Configuration)

On Tue, Oct 13, 2015 at 3:02 PM, Robinson, Landon - Landon <
landon.t.robinson@lowes.com> wrote:

> You can do it both ways: at the DoFn level or at the pipeline level.
>
> For global settings, go with the pipeline level. For individual
> jobs/tasks, go DoFn Level.
>
> *Pipeline Level:*
>
> Configuration crunchConf = getConf();
> crunchConf.set("mapred.job.queue.name", "batch");
> Pipeline pipeline = new MRPipeline(TransformKronosMR.class, "My Pipeline", crunchConf);
>
>
> *DoFn Level (as mentioned):*
>
> @Override
> public void configure(Configuration conf) {
>   conf.set("mapreduce.map.java.opts", "-Xmx3900m");
>   conf.set("mapreduce.reduce.java.opts", "-Xmx3900m");
>
>   conf.set("mapreduce.map.memory.mb", "4096");
>   conf.set("mapreduce.reduce.memory.mb", "4096");
> }
>
>
>
> ---------------------------------------------------------------------------
> Landon Robinson
> Big Data/Hadoop Engineer
> Lowe’s Companies Inc. | IT Business Intelligence
> ---------------------------------------------------------------------------
>
> From: Micah Whitacre <mkwhitacre@gmail.com>
> Reply-To: "user@crunch.apache.org" <user@crunch.apache.org>
> Date: Tuesday, October 13, 2015 at 3:55 PM
> To: "user@crunch.apache.org" <user@crunch.apache.org>
> Subject: Re: Hadoop Configuration from DoFn
>
> Luke,
>   Generally that configuration should be set on the Configuration object
> passed to Pipeline vs on the individual DoFns.  The configure(...) method
> is called when re-instantiating the DoFn on the Map/Reduce task and at that
> point those memory settings wouldn't be honored.
>
> On Tue, Oct 13, 2015 at 2:52 PM, Luke Hansen <luke@wealthfront.com> wrote:
>
>> Does anyone know if this is the right way to configure Hadoop from a
>> Crunch DoFn?  This didn't seem to affect anything.
>>
>> Thanks!
>>
>> @Override
>> public void configure(Configuration conf) {
>>   conf.set("mapreduce.map.java.opts", "-Xmx3900m");
>>   conf.set("mapreduce.reduce.java.opts", "-Xmx3900m");
>>
>>   conf.set("mapreduce.map.memory.mb", "4096");
>>   conf.set("mapreduce.reduce.memory.mb", "4096");
>> }
>>
>>
