Date: Wed, 13 Oct 2010 14:26:14 -0700
From: Konstantin Boudnik
To: common-user@hadoop.apache.org
Subject: Re: load a serialized object in hadoop

You should have no space here: "-D HADOOP_CLIENT_OPTS".
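Also note that HADOOP_CLIENT_OPTS is an environment variable read by the
bin/hadoop launcher script, not something your program sees as an argument,
so the form Luke suggested below (setting it in front of the command) is the
way to go. As a sketch, reusing the jar and class names from your own command
line, that should look like:

    HADOOP_CLIENT_OPTS=-Xmx4000m bin/hadoop jar Test.jar OOloadtest

or, if you would rather export it first:

    export HADOOP_CLIENT_OPTS=-Xmx4000m
    bin/hadoop jar Test.jar OOloadtest

On Wed, Oct 13, 2010 at 04:21 PM, Shi Yu wrote:
> Hi, thanks for the advice. I tried with your settings:
>
>   $ bin/hadoop jar Test.jar OOloadtest -D HADOOP_CLIENT_OPTS=-Xmx4000m
>
> Still no effect. Or is this a system variable? Should I export it?
> How do I configure it?
>
> Shi
>
>   java -Xms3G -Xmx3G -classpath .:WordCount.jar:hadoop-0.19.2-core.jar:lib/log4j-1.2.15.jar:lib/commons-collections-3.2.1.jar:lib/stanford-postagger-2010-05-26.jar OOloadtest
>
> On 2010-10-13 15:28, Luke Lu wrote:
>> On Wed, Oct 13, 2010 at 12:27 PM, Shi Yu wrote:
>>> I haven't implemented anything in map/reduce yet for this issue. I just
>>> try to invoke the same Java class using the bin/hadoop command. The thing
>>> is that a very simple program can be executed with plain java, but not
>>> with the bin/hadoop command.
>>
>> If you are just using the "bin/hadoop jar your.jar" command, your code
>> runs in a local client JVM and mapred.child.java.opts has no effect. You
>> should run it with:
>>
>>   HADOOP_CLIENT_OPTS=-Xmx1000m bin/hadoop jar your.jar
>>
>>> I think if I couldn't get through this first stage, even if I had a
>>> map/reduce program it would also fail. I am using Hadoop 0.19.2. Thanks.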
>>>
>>> Best Regards,
>>>
>>> Shi
>>>
>>> On 2010-10-13 14:15, Luke Lu wrote:
>>>> Can you post your mapper/reducer implementation? Or are you using
>>>> Hadoop Streaming, in which case mapred.child.java.opts doesn't apply to
>>>> the JVM you care about? BTW, which Hadoop version are you using?
>>>>
>>>> On Wed, Oct 13, 2010 at 11:45 AM, Shi Yu wrote:
>>>>> Here is my code. There is no Map/Reduce in it. I can run this code
>>>>> with "java -Xmx1000m"; however, with "bin/hadoop -D
>>>>> mapred.child.java.opts=-Xmx3000M" it fails with a heap space error.
>>>>> I have tried other programs in Hadoop with the same settings, so the
>>>>> memory is available on my machines.
>>>>>
>>>>>   public static void main(String[] args) {
>>>>>     try {
>>>>>       String myFile = "xxx.dat";
>>>>>       FileInputStream fin = new FileInputStream(myFile);
>>>>>       ois = new ObjectInputStream(fin);
>>>>>       margintagMap = ois.readObject();
>>>>>       ois.close();
>>>>>       fin.close();
>>>>>     } catch (Exception e) {
>>>>>       //
>>>>>     }
>>>>>   }
>>>>>
>>>>> On 2010-10-13 13:30, Luke Lu wrote:
>>>>>> On Wed, Oct 13, 2010 at 8:04 AM, Shi Yu wrote:
>>>>>>> As a follow-up to my own question: I think invoking the JVM through
>>>>>>> Hadoop requires much more memory than an ordinary JVM.
>>>>>>
>>>>>> That's simply not true. The default mapreduce task Xmx is 200M, which
>>>>>> is much smaller than the standard JVM default of 512M, and most users
>>>>>> don't need to increase it. Please post the code reading the object
>>>>>> (in HDFS?) in your tasks.
>>>>>>
>>>>>>> I found that instead of serializing the object, maybe I could create
>>>>>>> a MapFile as an index to permit lookups by key in Hadoop. I have also
>>>>>>> compared the performance of MongoDB and Memcache. I will let you know
>>>>>>> the result after I try the MapFile approach.
>>>>>>>
>>>>>>> Shi
>>>>>>>
>>>>>>> On 2010-10-12 21:59, M. C. Srivas wrote:
>>>>>>>> On Tue, Oct 12, 2010 at 4:50 AM, Shi Yu wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I want to load a serialized HashMap object in Hadoop. The file of
>>>>>>>>> the stored object is 200M. I can read that object efficiently in
>>>>>>>>> Java by setting -Xmx to 1000M. However, in Hadoop I can never load
>>>>>>>>> it into memory. The code is very simple (it just reads the
>>>>>>>>> ObjectInputStream) and there is no map/reduce implemented yet. I
>>>>>>>>> set mapred.child.java.opts=-Xmx3000M and still get
>>>>>>>>> "java.lang.OutOfMemoryError: Java heap space". Could anyone explain
>>>>>>>>> a little bit how memory is allocated to the JVM in Hadoop? Why does
>>>>>>>>> Hadoop take up so much memory? If a program requires 1G of memory
>>>>>>>>> on a single node, how much memory does it (generally) require in
>>>>>>>>> Hadoop?
>>>>>>>>
>>>>>>>> The JVM reserves swap space in advance, at the time of launching
>>>>>>>> the process.
>>>>>>>> If your swap is too low (or you do not have any swap configured),
>>>>>>>> you will hit this.
>>>>>>>>
>>>>>>>> Or you are on a 32-bit machine, in which case 3G is not possible
>>>>>>>> in the JVM.
>>>>>>>>
>>>>>>>> -Srivas.
>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> Shi
>>>>>>>>>
>>>>>>>>> --
>>>>>
>>>>> --
>>>>> Postdoctoral Scholar
>>>>> Institute for Genomics and Systems Biology
>>>>> Department of Medicine, the University of Chicago
>>>>> Knapp Center for Biomedical Discovery
>>>>> 900 E. 57th St. Room 10148
>>>>> Chicago, IL 60637, US
>>>>> Tel: 773-702-6799
>>>
>>> --
>>> Postdoctoral Scholar
>>> Institute for Genomics and Systems Biology
>>> Department of Medicine, the University of Chicago
>>> Knapp Center for Biomedical Discovery
>>> 900 E. 57th St. Room 10148
>>> Chicago, IL 60637, US
>>> Tel: 773-702-6799
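P.S. Once you do wrap this in a map/reduce job, mapred.child.java.opts is the
knob that sizes the task JVMs on the worker nodes; the client JVM launched by
bin/hadoop is still governed by HADOOP_CLIENT_OPTS. A minimal sketch against
the 0.19 JobConf API (the class name, job name, and paths below are
placeholders, not taken from your code):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.FileOutputFormat;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.lib.IdentityMapper;
  import org.apache.hadoop.mapred.lib.IdentityReducer;

  public class OOLoadJob {
    public static void main(String[] args) throws Exception {
      JobConf conf = new JobConf(OOLoadJob.class);
      conf.setJobName("oo-load-test");
      // Heap for every *task* JVM spawned on the worker nodes;
      // this setting does nothing for the local client JVM.
      conf.set("mapred.child.java.opts", "-Xmx1000m");
      conf.setMapperClass(IdentityMapper.class);
      conf.setReducerClass(IdentityReducer.class);
      FileInputFormat.setInputPaths(conf, new Path(args[0]));
      FileOutputFormat.setOutputPath(conf, new Path(args[1]));
      JobClient.runJob(conf);
    }
  }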