hadoop-common-user mailing list archives

From Shuja Rehman <shujamug...@gmail.com>
Subject Re: java.lang.OutOfMemoryError: Java heap space
Date Mon, 12 Jul 2010 13:29:58 GMT
Hi Patrick,
Thanks for the explanation. I have supplied the heap size in the mapper in the
following way:

-mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \

but I still get the same error. Any other ideas?
Thanks
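
For reference, one possible way to get a heap setting to the Groovy JVM is through its environment rather than as an extra word after the script path. The sketch below is only an illustration under two assumptions: that the groovy launcher script on the task nodes picks up JAVA_OPTS, and that the streaming option -cmdenv is used to export that variable to the mapper subprocess. The variable names HEAP_OPT and CMD are made up for this example, and the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch only: builds and prints a streaming command, it does not run Hadoop.
# Assumption: the groovy launcher on the task nodes reads JAVA_OPTS.

STREAMING_JAR=/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
MAPPER=/home/ftpuser1/Nodemapper5.groovy
MAPPER_HEAP=-Xmx2000m   # heap for the Groovy JVM, not for the task JVM

# -cmdenv NAME=VALUE exports a variable into the mapper's environment.
HEAP_OPT="-cmdenv JAVA_OPTS=${MAPPER_HEAP}"

# The remaining options (-input, -output, -inputreader, ...) would stay
# as in the original command; they are left out here for brevity.
CMD="hadoop jar ${STREAMING_JAR} ${HEAP_OPT} -mapper ${MAPPER} -file ${MAPPER}"

echo "${CMD}"
```

Whether the setting took effect can be checked with ps on a task node while the mapper is running, looking at the java process that belongs to the Groovy script rather than at the RunJar process.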

On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com> wrote:

> Shuja,
>
> Those settings (mapred.child.java.opts and mapred.child.ulimit) are only
> used for child JVMs that get forked by the TaskTracker. You are using
> Hadoop streaming, which means the TaskTracker is forking a JVM for
> streaming, which is then forking a shell process that runs your groovy
> code (in another JVM).
>
> I'm not much of a groovy expert, but if there's a way you can wrap your
> code around the MapReduce API, that would work best. Otherwise, you can
> just pass the heap size in the '-mapper' argument.
>
> Regards,
>
> - Patrick
>
> On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
>
> > Hi Alex,
> >
> > I have updated Java to the latest available version on all machines in the
> > cluster, and now I run the job with this line added:
> >
> > -D mapred.child.ulimit=3145728 \
> >
> > but I still get the same error. Here is the ps output for this job.
> >
> >
> > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> >   -Xmx1023m
> >   -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> >   -Dhadoop.log.file=hadoop.log
> >   -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> >   -Dhadoop.id.str=
> >   -Dhadoop.root.logger=INFO,console
> >   -Dhadoop.policy.file=hadoop-policy.xml
> >   -classpath /usr/lib/hadoop-0.20/conf:
> >     /usr/jdk1.6.0_03/lib/tools.jar:
> >     /usr/lib/hadoop-0.20:
> >     /usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:
> >     /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> >     /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:
> >     /usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> >     /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:
> >     /usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> >     /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:
> >     /usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> >     /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:
> >     /usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> >     /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:
> >     /usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> >     /usr/lib/hadoop-0.20/lib/hsqldb.jar:
> >     /usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> >     /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:
> >     /usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> >     /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:
> >     /usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> >     /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:
> >     /usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> >     /usr/lib/hadoop-0.20/lib/junit-4.5.jar:
> >     /usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> >     /usr/lib/hadoop-0.20/lib/libfb303.jar:
> >     /usr/lib/hadoop-0.20/lib/libthrift.jar:
> >     /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:
> >     /usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> >     /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:
> >     /usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> >     /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:
> >     /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> >     /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:
> >     /usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> >     /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:
> >     /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> >   org.apache.hadoop.util.RunJar
> >   /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> >   -D mapred.child.java.opts=-Xmx2000M
> >   -D mapred.child.ulimit=3145728
> >   -inputformat StreamInputFormat
> >   -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> >   -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> >   -jobconf mapred.map.tasks=1
> >   -jobconf mapred.reduce.tasks=0
> >   -output RNC14
> >   -mapper /home/ftpuser1/Nodemapper5.groovy
> >   -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> >   -file /home/ftpuser1/Nodemapper5.groovy
> > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> >
> >
> > Any clue?
> > Thanks
> >
> > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> >
> > > Hi Shuja,
> > >
> > > First, thank you for using CDH3.  Can you also check what
> > > *mapred.child.ulimit* you are using?  Try adding
> > > "*-D mapred.child.ulimit=3145728*" to the command line.
> > >
> > > I would also recommend upgrading Java to JDK 1.6 update 8 at a minimum,
> > > which you can download from the Java SE Homepage
> > > <http://java.sun.com/javase/downloads/index.jsp>.
> > >
> > > Let me know how it goes.
> > >
> > > Alex K
> > >
> > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > >
> > > > Hi Alex
> > > >
> > > > Yes, I am running the job on a cluster of 2 machines, using the Cloudera
> > > > distribution of Hadoop. Here is the output of this command:
> > > >
> > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > >   -Xmx1023m
> > > >   -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > >   -Dhadoop.log.file=hadoop.log
> > > >   -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > >   -Dhadoop.id.str=
> > > >   -Dhadoop.root.logger=INFO,console
> > > >   -Dhadoop.policy.file=hadoop-policy.xml
> > > >   -classpath /usr/lib/hadoop-0.20/conf:
> > > >     /usr/jdk1.6.0_03/lib/tools.jar:
> > > >     /usr/lib/hadoop-0.20:
> > > >     /usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:
> > > >     /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > >     /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:
> > > >     /usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > >     /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:
> > > >     /usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > >     /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:
> > > >     /usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > >     /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:
> > > >     /usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > >     /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:
> > > >     /usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > >     /usr/lib/hadoop-0.20/lib/hsqldb.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > >     /usr/lib/hadoop-0.20/lib/junit-4.5.jar:
> > > >     /usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > >     /usr/lib/hadoop-0.20/lib/libfb303.jar:
> > > >     /usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > >     /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:
> > > >     /usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > >     /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:
> > > >     /usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > >     /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:
> > > >     /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > >     /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:
> > > >     /usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:
> > > >     /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > >   org.apache.hadoop.util.RunJar
> > > >   /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > >   -D mapred.child.java.opts=-Xmx2000M
> > > >   -inputformat StreamInputFormat
> > > >   -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > >   -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > >   -jobconf mapred.map.tasks=1
> > > >   -jobconf mapred.reduce.tasks=0
> > > >   -output RNC11
> > > >   -mapper /home/ftpuser1/Nodemapper5.groovy
> > > >   -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> > > >   -file /home/ftpuser1/Nodemapper5.groovy
> > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > > >
> > > > ------------------------------------------------------------------------
> > > > Also, what is meant by OOM? Thanks for helping.
> > > >
> > > > Best Regards
> > > >
> > > >
> > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> > > >
> > > > > Hi Shuja,
> > > > >
> > > > > It looks like the OOM is happening in your code.  Are you running
> > > > > MapReduce in a cluster?  If so, can you send the exact command line
> > > > > your code is invoked with?  You can get it with a
> > > > > 'ps -Af | grep Nodemapper5.groovy' command on one of the nodes which
> > > > > is running the task.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Alex K
> > > > >
> > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > >
> > > > > > Hi All
> > > > > >
> > > > > > I am facing a hard problem. I am running a MapReduce job using
> > > > > > streaming, but it fails with the following error.
> > > > > >
> > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > >
> > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> > > > > >        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > >        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > >        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > >        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > >
> > > > > >
> > > > > > I have increased the heap size in hadoop-env.sh to 2000M. I also
> > > > > > set it for the job manually with the following line:
> > > > > >
> > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > >
> > > > > > but it still gives the error. The same job runs fine if I run it in
> > > > > > the shell using a 1024M heap size, like
> > > > > >
> > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > >
> > > > > >
> > > > > > Any clue?
> > > > > >
> > > > > > Thanks in advance.
> > > > > >
> > > > > > --
> > > > > > Regards
> > > > > > Shuja-ur-Rehman Baig
> > > > > > _________________________________
> > > > > > MS CS - School of Science and Engineering
> > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > Cell: +92 3214207445
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
