giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: more concurrency issues with requests management?
Date Tue, 19 Feb 2013 23:04:31 GMT
Looking at the second line of the logs, it does not look like the case, but
it could be. do you have any suggestion for the flags in particular? I'm
running with -XX:+HeapDumpOnOutOfMemoryError  now, and I don't get any dump.

It could be the case, but the behavior is quite consistent currently, I'd
get a Child error/OOM pretty soon in these cases.


On Tue, Feb 19, 2013 at 11:45 PM, Avery Ching <aching@apache.org> wrote:

> Can you add GC flags?  Perhaps you are running into full GCs...
>
>
> On 2/19/13 1:45 PM, Claudio Martella wrote:
>
>> I have not investigated further yet, but this does not look connected to
>> ZK, but mostly to some concurrency on the queues. The number of requests
>> stays to 29...
>>
>>
>> On Tue, Feb 19, 2013 at 9:58 PM, Sebastian Schelter <ssc@apache.org>
>> wrote:
>>
>>  For me, Giraph on Hadoop 1.0.4 runs into some kind of deadlock after the
>>> input was loaded when using it with 200 workers on a billion-scale
>>> graph....
>>>
>>> I tried several versions of zookeeper, but nothing helped.
>>>
>>> Best,
>>> Sebastian
>>>
>>>
>>> On 19.02.2013 21:55, Claudio Martella wrote:
>>>
>>>> I'm running PR with trunk on yahoo webgraph across 528 workers and I get
>>>> the workers hanging at the flushing:
>>>>
>>>> 2013-02-19 21:40:07,187 INFO org.apache.giraph.graph.**ComputeCallable:
>>>>
>>> call:
>>>
>>>> Computation took 28.386961 secs for 95 partitions on superstep 0.
>>>>
>>>   Flushing
>>>
>>>> started
>>>> 2013-02-19 21:40:07,604 INFO org.apache.giraph.worker.**
>>>> BspServiceWorker:
>>>> finishSuperstep: Waiting on all requests, superstep 0 Memory
>>>> (free/total/max) = 5993.42M / 8661.44M / 14346.69M
>>>> 2013-02-19 21:40:07,605 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 558 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 6.4758, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0003, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 27.083
>>>> 2013-02-19 21:40:26,717 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 39 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 3.7965, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0.0001, MBytesSent = 175.3834, MBytesReceived = 0.0068, ave sent req
>>>>
>>> MBytes
>>>
>>>> = 0.2998, ave received req MBytes = 0, secs waited = 46.195
>>>> 2013-02-19 21:40:32,248 INFO
>>>> org.apache.giraph.comm.netty.**handler.RequestDecoder: decode: Server
>>>>
>>> window
>>>
>>>> metrics MBytes/sec sent = 0.0002, MBytes/sec received = 3.9824,
>>>>
>>> MBytesSent
>>>
>>>> = 0.0066, MBytesReceived = 155.4787, ave sent req MBytes = 0, ave
>>>>
>>> received
>>>
>>>> req MBytes = 0.0155, secs waited = 39.04
>>>> 2013-02-19 21:40:42,041 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 36 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 2.8509, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0.0001, MBytesSent = 175.3834, MBytesReceived = 0.0068, ave sent req
>>>>
>>> MBytes
>>>
>>>> = 0.2998, ave received req MBytes = 0, secs waited = 61.519
>>>> 2013-02-19 21:41:07,456 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 2.0174, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0.0001, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req
>>>>
>>> MBytes
>>>
>>>> = 0.2998, ave received req MBytes = 0, secs waited = 86.933
>>>> 2013-02-19 21:41:22,456 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 1.7205, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0.0001, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req
>>>>
>>> MBytes
>>>
>>>> = 0.2998, ave received req MBytes = 0, secs waited = 101.934
>>>> 2013-02-19 21:41:37,457 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 1.4998, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0.0001, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req
>>>>
>>> MBytes
>>>
>>>> = 0.2998, ave received req MBytes = 0, secs waited = 116.935
>>>> 2013-02-19 21:41:52,458 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 1.3293, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0.0001, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req
>>>>
>>> MBytes
>>>
>>>> = 0.2998, ave received req MBytes = 0, secs waited = 131.936
>>>> 2013-02-19 21:42:22,459 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 1.083, MBytes/sec received
>>>> =
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 161.937
>>>> 2013-02-19 21:42:37,460 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.9912, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 176.938
>>>> 2013-02-19 21:43:07,461 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.8475, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 206.939
>>>> 2013-02-19 21:43:37,462 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.7402, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 236.94
>>>> 2013-02-19 21:43:52,463 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.6961, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 251.941
>>>> 2013-02-19 21:44:07,464 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.657, MBytes/sec received
>>>> =
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 266.942
>>>> 2013-02-19 21:44:37,465 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.5906, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 296.943
>>>> 2013-02-19 21:45:07,466 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.5364, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 326.944
>>>> 2013-02-19 21:45:22,467 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.5129, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 341.945
>>>> 2013-02-19 21:45:37,468 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.4913, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 356.946
>>>> 2013-02-19 21:46:07,469 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.4532, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 386.947
>>>> 2013-02-19 21:46:22,470 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.4363, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 401.948
>>>> 2013-02-19 21:46:52,471 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.406, MBytes/sec received
>>>> =
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 431.949
>>>> 2013-02-19 21:47:07,472 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.3924, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 446.95
>>>> 2013-02-19 21:47:37,473 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.3677, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 476.951
>>>> 2013-02-19 21:48:07,474 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.346, MBytes/sec received
>>>> =
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 506.952
>>>> 2013-02-19 21:48:22,475 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.336, MBytes/sec received
>>>> =
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 521.953
>>>> 2013-02-19 21:48:52,476 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.3177, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 551.954
>>>> 2013-02-19 21:49:07,477 INFO org.apache.giraph.comm.netty.**
>>>> NettyClient:
>>>> waitSomeRequests: Waiting interval of 15000 msecs, 29 open requests,
>>>> waiting for it to be <= 0, MBytes/sec sent = 0.3093, MBytes/sec received
>>>>
>>> =
>>>
>>>> 0, MBytesSent = 175.3834, MBytesReceived = 0.0069, ave sent req MBytes =
>>>> 0.2998, ave received req MBytes = 0, secs waited = 566.955
>>>>
>>>> It looks similar to the problem we had earlier with concurrency.
>>>>
>>>> Are you guys experiencing something similar at large-scale?
>>>>
>>>>
>>>
>>
>


-- 
   Claudio Martella
   claudio.martella@gmail.com

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message