From: Sandy
To: core-user@hadoop.apache.org
Date: Wed, 20 Aug 2008 12:43:02 -0500
Subject: Re: pseudo-global variable construction

Thank you very much, Paco and Jason. It works!

For any users who may be curious what this looks like in code, here is a
small snippet of mine:

file: myLittleMRProgram.java

package org.apache.hadoop.examples;

public static class Reduce extends MapReduceBase implements Reducer {
    private int nTax = 0;

    public void configure(JobConf job) {
        super.configure(job);
        String tax = job.get("nTax");
        nTax = Integer.parseInt(tax);
    }

    public void reduce(/* ... */) throws IOException {
        ....
        System.out.println("nTax is: " + nTax);
    }
}

....

main() {
    ....
    conf.set("nTax", other_args.get(2));
    JobClient.runJob(conf);
    ....
    return 0;
}
--------
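For a fuller picture, the same pattern fleshed out into one self-contained
job might look roughly like the following. This is only a sketch against the
0.17/0.18-era org.apache.hadoop.mapred API; the line-counting map and all
names here are illustrative assumptions, not the actual program above:

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class MyLittleMRProgram {

    // Illustrative map: emit each input line with a count of one.
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
            output.collect(value, ONE);
        }
    }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        private int nTax = 0;

        // configure() runs once per task, on the remote node, after the
        // serialized JobConf has been shipped over -- this is where the
        // per-job parameter is read back out.
        public void configure(JobConf job) {
            super.configure(job);
            nTax = job.getInt("nTax", 0);   // 0 if "nTax" was never set
        }

        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum * nTax));
        }
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(MyLittleMRProgram.class);
        conf.setJobName("ntax-demo");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Serialize the user-supplied value into the job configuration;
        // every map and reduce task will see it in configure().
        conf.set("nTax", args[2]);
        JobClient.runJob(conf);
    }
}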
-SM

On Tue, Aug 19, 2008 at 5:02 PM, Jason Venner wrote:

> Since the map & reduce tasks generally run in separate Java virtual
> machines, and on machines distinct from the one running your main task,
> there is no sharing of variables between the main task and the map or
> reduce tasks.
>
> The standard way is to store the variable in the Configuration (or
> JobConf) object in your main task. Then, in the configure method of your
> map and reduce task classes, extract the variable value from the JobConf
> object.
>
> You will need to override the configure method in your map and reduce
> classes.
>
> This also requires that the variable value be serializable. For lots of
> large variables this can be expensive.
>
> Sandy wrote:
>
>> Hello,
>>
>> My M/R program is going smoothly, except for one small problem. I have
>> a "global" variable that is set by the user (and thus in the main
>> function) that I want one of my reduce functions to access. This is a
>> read-only variable. After some reading in the forums, I tried something
>> like this:
>>
>> file: MyGlobalVars.java
>>
>> package org.apache.hadoop.examples;
>>
>> public class MyGlobalVars {
>>     static public int nTax;
>> }
>> ------
>>
>> file: myLittleMRProgram.java
>>
>> package org.apache.hadoop.examples;
>>
>> map function() {
>>     System.out.println("in map function, nTax is: " + MyGlobalVars.nTax);
>> }
>> ....
>> main() {
>>     MyGlobalVars.nTax = other_args.get(2);
>>     System.out.println("in main function, nTax is: " + MyGlobalVars.nTax);
>>     ....
>>     JobClient.runJob(conf);
>>     ....
>>     return 0;
>> }
>> --------
>>
>> When I run it, I get:
>>
>> in main function, nTax is 20 (which is what I want)
>> in map function, nTax is 0 (<--- this is not right)
>>
>> I am a little confused about how to resolve this. I apologize in
>> advance if this is a blatant Java error; I only began programming in
>> the language a few weeks ago.
>>
>> Since MapReduce tries to avoid the whole shared-memory scene, I am more
>> than willing to have each reduce function receive a local copy of this
>> user-defined value. However, I am a little confused about the best way
>> to do this. As I see it, my options are:
>>
>> 1.) Write the user-defined value to HDFS in the main function, and have
>> it read from HDFS in the reduce function. I can't quite figure out the
>> code for this, though. I know how to specify -an- input file for the
>> map reduce task, but if I did it this way, won't I need to specify two
>> separate input files?
>>
>> 2.) Put it in the construction of the reduce object (I saw this
>> mentioned in the archives). How would I accomplish this exactly when
>> the value is user-defined? Parameter passing? If so, won't this require
>> me to change the underlying MapReduceBase (which makes me a touch
>> nervous, since I'm still very new to Hadoop)?
>>
>> What would be the easiest way to do this?
>>
>> Thanks in advance for the help. I appreciate your time.
>>
>> -SM
>
> --
> Jason Venner
> Attributor - Program the Web
> Attributor is hiring Hadoop Wranglers and coding wizards; contact if
> interested.
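For anyone curious about option 1 from the quoted question above (writing
the value to HDFS and re-reading it in each task): it does work, though for
a single int the JobConf route is much simpler. The side file is opened
directly through the FileSystem API, so it is not a second job input and
nothing about the input specification changes. A rough, untested sketch,
with a purely hypothetical /tmp/nTax.txt path and made-up helper names
(org.apache.hadoop.filecache.DistributedCache covers similar ground for
larger read-only files):

package org.apache.hadoop.examples;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class SideFileParam {
    // Hypothetical location; any HDFS path that both the submitting
    // client and the tasks can reach would do.
    private static final Path PARAM_FILE = new Path("/tmp/nTax.txt");

    // Call from main() before JobClient.runJob(conf).
    public static void writeParam(JobConf conf, String value)
            throws IOException {
        FileSystem fs = FileSystem.get(conf);
        FSDataOutputStream out = fs.create(PARAM_FILE, true);  // overwrite
        out.writeBytes(value + "\n");
        out.close();
    }

    // Call from the reducer's configure(JobConf). Note configure() cannot
    // throw a checked exception, so wrap any IOException from here in a
    // RuntimeException there.
    public static int readParam(JobConf conf) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(fs.open(PARAM_FILE)));
        try {
            return Integer.parseInt(in.readLine().trim());
        } finally {
            in.close();
        }
    }
}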