giraph-dev mailing list archives

From Eugene Koontz <ekoo...@hiro-tan.org>
Subject Re: Review Request: Adding calls to progress during ZooKeeper barrier waits allows healthy jobs to progress to completion without timing out on some Hadoop clusters.
Date Wed, 18 Jul 2012 00:58:58 GMT
Hi Eli,
I got a 404 on:
https://reviews.apache.org/r/6026/diff/


I can see the review itself: https://reviews.apache.org/r/6026/

but no way to review the diff inline.

-Eugene

On 7/17/12 5:37 PM, Eli Reisman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6026/
> -----------------------------------------------------------
> 
> Review request for giraph.
> 
> 
> Description
> -------
> 
> Repeats a simple pattern of changes in places where an idle worker might wait
> uninterrupted in a PredicateLock (awaiting a BspEvent in BspServiceWorker in most cases) for
> a Znode to be published that allows it to progress onward. Without occasional calls to context.progress()
> in these waits, otherwise healthy jobs can time out because idle workers are not heartbeating to
> the underlying Hadoop system. This patch still allows timeouts, but only when the workers
> have actually failed.
> 
> 
> Diffs
> -----
> 
> 
> Diff: https://reviews.apache.org/r/6026/diff/
> 
> 
> Testing
> -------
> 
> Tested July 14, 15, and 16 on a cluster with a variety of data loads and memory/worker constraints.
> Passes 'mvn verify' etc.
> 
> 
> Thanks,
> 
> Eli Reisman
> 
> 
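
Since the diff link returns a 404, the pattern described in the review request can be sketched roughly as follows. This is not the actual patch: ProgressReporter and ProgressingEvent are hypothetical stand-ins for Hadoop's context.progress() callback and Giraph's PredicateLock, and the reporting interval is an assumed parameter. The idea is to replace one unbounded wait with a loop of bounded waits, reporting progress after each one.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the Hadoop callback behind context.progress(). */
interface ProgressReporter {
    void progress();
}

/** Hypothetical event in the spirit of a PredicateLock; not Giraph's actual class. */
class ProgressingEvent {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition signaled = lock.newCondition();
    private boolean eventOccurred = false;

    /** Marks the event as having occurred and wakes every waiter. */
    void signal() {
        lock.lock();
        try {
            eventOccurred = true;
            signaled.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Blocks until signal() has been called. Instead of one unbounded wait,
     * the thread waits in slices of at most intervalMs and reports progress
     * after each slice, so a healthy but idle waiter keeps heartbeating.
     */
    void waitWithProgress(ProgressReporter reporter, long intervalMs)
            throws InterruptedException {
        lock.lock();
        try {
            while (!eventOccurred) {
                // Returns early if signaled, otherwise after the timeout.
                signaled.await(intervalMs, TimeUnit.MILLISECONDS);
                reporter.progress();
            }
        } finally {
            lock.unlock();
        }
    }
}
```

A worker whose process has actually died never reaches the progress call, so the framework's task timeout would still apply to it under this scheme.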


