giraph-dev mailing list archives

From Charith Wickramarachchi <charith.dhanus...@gmail.com>
Subject Re: Giraph Job Get Killed suddenly
Date Tue, 11 Nov 2014 19:42:39 GMT
Thanks for the quick replies.

I did some digging into the logs. The job seems to be failing with a
"GC overhead limit exceeded" error (a java.lang.OutOfMemoryError). I think
it might be caused by some unnecessary overhead in my implementation. I
will optimize my code to avoid this.

2014-11-11 10:34:22,482 INFO [main] org.apache.giraph.graph.GraphTaskManager: execute: 8 partitions to process with 1 compute thread(s), originally 1 thread(s) on superstep 0
2014-11-11 10:34:38,266 WARN [netty-client-exec-0] io.netty.util.concurrent.SingleThreadEventExecutor: Unexpected exception from an event executor:
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.addConditionWaiter(AbstractQueuedSynchronizer.java:1857)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2073)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
        at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
        at java.lang.Thread.run(Thread.java:722)


I tried setting the mapred.child.java.opts option, but then the job failed
with the following error:

2014-11-11 11:04:45,780 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
Diagnostics report from attempt_1415143619219_0009_m_000004_0:
Container [pid=28984,containerID=container_1415143619219_0009_01_000006]
is running beyond virtual memory limits. Current usage: 156.3 MB of 1
GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing
container.
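
The numbers in that diagnostic are consistent with YARN's virtual memory
check: a 1 GB container is reported with a 2.1 GB virtual memory limit,
which matches a yarn.nodemanager.vmem-pmem-ratio of 2.1 (assumed here to be
the cluster's setting; it is the usual default). Below is a minimal sketch
of that arithmetic in plain Java. The class and method names are
hypothetical, the 2048 MB container size and the 0.8 heap fraction are
illustrative rules of thumb rather than recommended values, and the
mapreduce.map.memory.mb / mapreduce.map.java.opts properties are the Hadoop
2.x names I would expect to apply to Giraph's map tasks.

```java
public class ContainerMemoryCheck {
    // Assumption: the cluster uses YARN's usual yarn.nodemanager.vmem-pmem-ratio of 2.1.
    public static final double ASSUMED_VMEM_PMEM_RATIO = 2.1;

    /** Virtual memory ceiling (MB) for a container with the given physical allocation. */
    public static double vmemLimitMb(int containerMb, double vmemPmemRatio) {
        return containerMb * vmemPmemRatio;
    }

    /** True when the reported virtual memory usage is above the ceiling. */
    public static boolean exceedsVmemLimit(double vmemUsedMb, int containerMb, double vmemPmemRatio) {
        return vmemUsedMb > vmemLimitMb(containerMb, vmemPmemRatio);
    }

    /** Heap size (MB) that leaves some of the container for non-heap memory. */
    public static int suggestedHeapMb(int containerMb, double heapFraction) {
        return (int) Math.floor(containerMb * heapFraction);
    }

    public static void main(String[] args) {
        int containerMb = 1024;          // "1 GB physical memory" from the log
        double vmemUsedMb = 2.7 * 1024;  // "2.7 GB ... virtual memory used" from the log

        System.out.println("vmem limit (MB): "
                + vmemLimitMb(containerMb, ASSUMED_VMEM_PMEM_RATIO));
        System.out.println("over limit: "
                + exceedsVmemLimit(vmemUsedMb, containerMb, ASSUMED_VMEM_PMEM_RATIO));

        // Hypothetical larger container: size the heap below the container size
        // so that the JVM's non-heap memory still fits.
        int largerContainerMb = 2048;
        System.out.println("-Dmapreduce.map.memory.mb=" + largerContainerMb
                + " -Dmapreduce.map.java.opts=-Xmx"
                + suggestedHeapMb(largerContainerMb, 0.8) + "m");
    }
}
```

The point of the sketch is that raising -Xmx alone does not raise the
container's limits; the container size probably needs to grow along with
the heap, or the job can still be killed by the memory check.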


Thanks,

Charith

On Tue, Nov 11, 2014 at 10:57 AM, Unmesh Joshi <unmesh.joshi126@gmail.com>
wrote:

> Try increasing the memory with the -Xmx1024m option. Replace 1024 with
> whatever amount suits the memory you have available. Set this in
> mapred.child.java.opts.
>
>
>
>
>    Regards,
>    Unmesh Joshi
>
>
> On 11 November 2014 10:26, Charith Wickramarachchi <
> charith.dhanushka@gmail.com> wrote:
>
> > Hi Devs,
> >
> > I am sending this mail to the dev list since I think Giraph developers
> > might have experienced the issue I am facing.
> >
> > I am working on extending Giraph to support a programming model somewhat
> > similar to Giraph++. I got an initial POC version running on my local
> > machine in pseudo-distributed mode. But when I run it with large graphs
> > on a cluster, the MapReduce job suddenly gets killed.
> >
> > The job receives a kill signal, and I am still not sure what the root
> > cause is. My hunch is that it has something to do with progress
> > reporting from the mappers. I am attaching part of the log that might
> > be helpful.
> >
> > It would be great if you could give me some insights based on your
> > experience.
> >
> > Giraph Version: 1.1.0
> > Hadoop version: 2.2.0
> > Application Type: Map Reduce
> >
> > Thanks,
> > Charith
> >
> > --
> > Charith Dhanushka Wickramaarachchi
> >
> > Tel  +1 213 447 4253
> > Web  http://apache.org/~charith <http://www-scf.usc.edu/~cwickram/>
> > <http://charith.wickramaarachchi.org/>
> > Blog  http://charith.wickramaarachchi.org/
> > <http://charithwiki.blogspot.com/>
> > Twitter  @charithwiki <https://twitter.com/charithwiki>
> >
> > This communication may contain privileged or other confidential
> > information and is intended exclusively for the addressee/s. If you are
> > not the intended recipient/s, or believe that you may have received this
> > communication in error, please reply to the sender indicating that fact
> > and delete the copy you received and in addition, you should not print,
> > copy, retransmit, disseminate, or otherwise use the information
> > contained in this communication. Internet communications cannot be
> > guaranteed to be timely, secure, error or virus-free. The sender does
> > not accept liability for any errors or omissions
> >
>



