hadoop-common-user mailing list archives

From Alex Kozlov <ale...@cloudera.com>
Subject Re: java.lang.OutOfMemoryError: Java heap space
Date Mon, 12 Jul 2010 20:01:38 GMT
Hi Shuja,

Java honors the last -Xmx, so if you have multiple "-Xmx ..." options on the
command line, only the last one takes effect.  Unfortunately, the command
lines you pasted are truncated.  Can you show us the full command line,
particularly for process 26162?  That one seems to be causing the problems.
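
A quick way to check which -Xmx actually wins (a throwaway snippet, nothing
Hadoop-specific; MaxHeap is a made-up name for illustration):

    public class MaxHeap {
        public static void main(String[] args) {
            // Effective maximum heap in MB. Running
            //   java -Xmx256m -Xmx1024m MaxHeap
            // prints roughly 1024, not 256 -- the last -Xmx wins.
            System.out.println(Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
        }
    }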

If you are running your cluster on 2 nodes, the task may have been scheduled
on the second node.  Did you run "ps -aef" on the second node as well?  You
can see the task assignment in the JT web UI (http://jt-name:50030, drill
down to the tasks).
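
Something like this on each node will show whether the groovy JVM is there
(the [N] in the pattern just keeps grep from matching itself):

    ps -ef | grep '[N]odemapper5.groovy'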

I suggest you debug your program in local mode first, however (use the
"*-jt local*" option).  Also, did you try the "*-D mapred.child.ulimit=3145728*"
option?  I do not see it on your command line.
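
For reference, a sketch of where those options would sit in your invocation
(the generic -D/-jt options must come before the streaming-specific ones;
everything else stays exactly as you have it):

    hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
      -D mapred.child.java.opts=-Xmx1024m \
      -D mapred.child.ulimit=3145728 \
      -jt local \
      ... (your -input/-inputreader/-mapper/-output arguments unchanged)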

Alex K

On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com> wrote:

> Hi Alex
>
> I have tried using quotes and also with -jt local, but I get the same heap
> error. Here is the output of ps -aef:
>
> UID        PID  PPID  C STIME TTY          TIME CMD
> root         1     0  0 04:37 ?        00:00:00 init [3]
> root         2     1  0 04:37 ?        00:00:00 [migration/0]
> root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
> root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
> root         5     1  0 04:37 ?        00:00:00 [events/0]
> root         6     1  0 04:37 ?        00:00:00 [khelper]
> root         7     1  0 04:37 ?        00:00:00 [kthread]
> root         9     7  0 04:37 ?        00:00:00 [xenwatch]
> root        10     7  0 04:37 ?        00:00:00 [xenbus]
> root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
> root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
> root        22     7  0 04:37 ?        00:00:00 [khubd]
> root        24     7  0 04:37 ?        00:00:00 [kseriod]
> root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
> root        85     7  0 04:37 ?        00:00:00 [pdflush]
> root        86     7  0 04:37 ?        00:00:00 [pdflush]
> root        87     7  0 04:37 ?        00:00:00 [kswapd0]
> root        88     7  0 04:37 ?        00:00:00 [aio/0]
> root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
> root       248     7  0 04:37 ?        00:00:00 [kstriped]
> root       257     7  0 04:37 ?        00:00:00 [kjournald]
> root       279     7  0 04:37 ?        00:00:00 [kauditd]
> root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
> root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
> root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
> root       660     7  0 04:37 ?        00:00:00 [kjournald]
> root       662     7  0 04:37 ?        00:00:00 [kjournald]
> root      1032     1  0 04:38 ?        00:00:00 auditd
> root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
> root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
> root      1052     1  0 04:38 ?        00:00:00 klogd -x
> root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
> root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
> dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
> root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
> root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
> root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
> root      1244     1  0 04:38 ?        00:00:00 pcscd
> root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
> root      1295     1  0 04:38 ?        00:00:00 automount
> root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
> root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting connections
> smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice -t exps2
> root      1410     1  0 04:38 ?        00:00:00 crond
> xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
> root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
> 68        1508     1  0 04:38 ?        00:00:00 hald
> root      1509  1508  0 04:38 ?        00:00:00 hald-runner
> root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q never
> root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
> root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
> root     21024 21022  0 11:27 pts/0    00:00:00 -bash
> root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
> root     21105 21103  0 11:28 pts/1    00:00:00 -bash
> root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
> root     21994 21992  0 11:47 pts/2    00:00:00 -bash
> root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
> root     22437 22433  0 11:49 pts/3    00:00:00 -bash
> hadoop   24808     1  0 12:01 ?        00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> hadoop   24893     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> hadoop   24988     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> hadoop   25085     1  0 12:01 ?        00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> hadoop   25175     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> root     25925 21994  1 12:06 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef
>
>
> *The command which I am executing is:*
>
>
> hadoop jar
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> -D mapred.child.java.opts=-Xmx1024m \
> -inputformat StreamInputFormat \
> -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> -jobconf mapred.map.tasks=1 \
> -jobconf mapred.reduce.tasks=0 \
> -output RNC25 \
> -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
> -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> -file /home/ftpuser1/Nodemapper5.groovy \
> -jt local
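>
> (The trailing -Xmx2000m inside the -mapper string above is handed to the
> Groovy script as an ordinary argument, so it presumably never reaches the
> JVM itself. If the goal is a larger heap for the Groovy JVM, one hedged
> sketch, assuming the stock groovy launcher which reads the JAVA_OPTS
> environment variable, would be:
>
> -mapper "env JAVA_OPTS=-Xmx1024m /home/ftpuser1/Nodemapper5.groovy" \
> )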
>
> I have noticed that all the hadoop processes show the 2001m heap size which
> I have set in hadoop-env.sh. On the command, I give 2000 in the mapper and
> 1024 in child.java.opts, but I think these values (1024, 2001) are not in
> use. Secondly, the following lines
>
> *hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b*
>
> did not appear the first time the job ran; they appear after the job has
> failed for the first time and tries to start mapping again. I have one more
> question: since all the hadoop processes (namenode, datanode, tasktracker...)
> show a 2001m heap size, does it mean all these processes are using 2001m of
> memory?
>
> Regards
> Shuja
>
>
> On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <alexvk@cloudera.com> wrote:
>
> > Hi Shuja,
> >
> > I think you need to enclose the invocation string in quotes.  Try:
> >
> > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> >
> > Also, it would be nice to see how exactly the groovy is invoked.  Does
> > groovy start and then give you the OOM, or does the OOM occur during the
> > start?  Can you see the new process with "ps -aef"?
> >
> > Can you run groovy in local mode?  Try the "-jt local" option.
> >
> > Thanks,
> >
> > Alex K
> >
> > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> >
> > > Hi Patrick,
> > > Thanks for the explanation. I have supplied the heap size to the mapper
> > > in the following way:
> > >
> > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > >
> > > but I still get the same error. Any other idea?
> > > Thanks
> > >
> > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com> wrote:
> > >
> > > > Shuja,
> > > >
> > > > Those settings (mapred.child.java.opts and mapred.child.ulimit) are only
> > > > used for child JVMs that get forked by the TaskTracker. You are using
> > > > Hadoop streaming, which means the TaskTracker forks a JVM for streaming,
> > > > which then forks a shell process that runs your groovy code (in another
> > > > JVM).
> > > >
> > > > I'm not much of a groovy expert, but if there's a way you can wrap your
> > > > code around the MapReduce API, that would work best. Otherwise, you can
> > > > just pass the heap size in the '-mapper' argument.
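> > > >
> > > > For example (a sketch; it assumes the stock groovy launcher script,
> > > > which picks up extra JVM flags from the JAVA_OPTS environment variable,
> > > > plus the standard env utility):
> > > >
> > > > -mapper "env JAVA_OPTS=-Xmx512m /home/ftpuser1/Nodemapper5.groovy"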
> > > >
> > > > Regards,
> > > >
> > > > - Patrick
> > > >
> > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > >
> > > > > Hi Alex,
> > > > >
> > > > > I have updated Java to the latest available version on all machines
> > > > > in the cluster, and now I run the job with this line added:
> > > > >
> > > > > -D mapred.child.ulimit=3145728 \
> > > > >
> > > > > but still the same error. Here is the output of this job:
> > > > >
> > > > >
> > > > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
> > > > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str=
> > > > > -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml
> > > > > -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:
> > > > > /usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:
> > > > > /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:
> > > > > /usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:
> > > > > /usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:
> > > > > /usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:
> > > > > /usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:
> > > > > /usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > > > /usr/lib/hadoop-0.20/lib/hsqldb.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > > > /usr/lib/hadoop-0.20/lib/junit-4.5.jar:
> > > > > /usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > > > /usr/lib/hadoop-0.20/lib/libfb303.jar:
> > > > > /usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > > > /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:
> > > > > /usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:
> > > > > /usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:
> > > > > /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:
> > > > > /usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:
> > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > org.apache.hadoop.util.RunJar
> > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > -inputformat StreamInputFormat
> > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > > -mapper /home/ftpuser1/Nodemapper5.groovy
> > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> > > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> > > > >
> > > > >
> > > > > Any clue?
> > > > > Thanks
> > > > >
> > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> > > > >
> > > > > > Hi Shuja,
> > > > > >
> > > > > > First, thank you for using CDH3.  Can you also check what
> > > > > > *mapred.child.ulimit* you are using?  Try adding
> > > > > > "*-D mapred.child.ulimit=3145728*" to the command line.
> > > > > >
> > > > > > I would also recommend upgrading Java to JDK 1.6 update 8 at a
> > > > > > minimum, which you can download from the Java SE Homepage
> > > > > > <http://java.sun.com/javase/downloads/index.jsp>.
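> > > > > >
> > > > > > (For reference, and if memory serves, mapred.child.ulimit is
> > > > > > expressed in kilobytes, so 3145728 KB = 3 * 1024 * 1024 KB = 3 GB
> > > > > > of virtual memory for the forked child process.)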
> > > > > >
> > > > > > Let me know how it goes.
> > > > > >
> > > > > > Alex K
> > > > > >
> > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Alex
> > > > > > >
> > > > > > > Yeah, I am running a job on a cluster of 2 machines and using the
> > > > > > > Cloudera distribution of hadoop. Here is the output of this command:
> > > > > > >
> > > > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
> > > > > > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str=
> > > > > > > -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml
> > > > > > > -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:
> > > > > > > /usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/junit-4.5.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/libfb303.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > org.apache.hadoop.util.RunJar
> > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy
> > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> > > > > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > And what is meant by OOM? Thanks for helping.
> > > > > > >
> > > > > > > Best Regards
> > > > > > >
> > > > > > >
> > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> > > > > > >
> > > > > > > > Hi Shuja,
> > > > > > > >
> > > > > > > > It looks like the OOM is happening in your code.  Are you running
> > > > > > > > MapReduce in a cluster?  If so, can you send the exact command line
> > > > > > > > your code is invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > > Nodemapper5.groovy' command on one of the nodes which is running
> > > > > > > > the task?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Alex K
> > > > > > > >
> > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi All
> > > > > > > > >
> > > > > > > > > I am facing a hard problem. I am running a map reduce job
> > > > > > > > > using streaming, but it fails and gives the following error.
> > > > > > > > >
> > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > >
> > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > > > > > > > > subprocess failed with code 1
> > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > >        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I have increased the heap size in hadoop-env.sh and made it
> > > > > > > > > 2000M. Also, I tell the job explicitly with the following line:
> > > > > > > > >
> > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > >
> > > > > > > > > but it still gives the error. The same job runs fine if I run
> > > > > > > > > it on the shell using a 1024M heap size, like:
> > > > > > > > >
> > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
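> > > > > > > > >
> > > > > > > > > (Presumably the heap for that shell run comes from the groovy
> > > > > > > > > launcher itself; assuming the stock launcher script, which
> > > > > > > > > reads JVM flags from the JAVA_OPTS environment variable, the
> > > > > > > > > local test could be pinned explicitly with:
> > > > > > > > >
> > > > > > > > > cat file.xml | env JAVA_OPTS=-Xmx1024m /root/Nodemapper5.groovy )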
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Any clue?
> > > > > > > > >
> > > > > > > > > Thanks in advance.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards
> > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > _________________________________
> > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > Cell: +92 3214207445
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Regards
> > > > > > > Shuja-ur-Rehman Baig
> > > > > > > _________________________________
> > > > > > > MS CS - School of Science and Engineering
> > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > Cell: +92 3214207445
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards
> > > > > Shuja-ur-Rehman Baig
> > > > > _________________________________
> > > > > MS CS - School of Science and Engineering
> > > > > Lahore University of Management Sciences (LUMS)
> > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > Cell: +92 3214207445
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>
