giraph-dev mailing list archives

From "Eli Reisman" <initialcont...@gmail.com>
Subject Review Request: Adding calls to progress during ZooKeeper barrier waits allows healthy jobs to progress to completion without timing out on some Hadoop clusters.
Date Wed, 18 Jul 2012 00:37:05 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6026/
-----------------------------------------------------------

Review request for giraph.


Description
-------

This patch repeats one simple change at each place where an idle worker might wait uninterrupted
in a PredicateLock (in most cases awaiting a BspEvent in BspServiceWorker) for a Znode to be
published that allows it to proceed. Without occasional calls to context.progress()
during these waits, otherwise healthy jobs can time out because idle workers are not heartbeating to
the underlying Hadoop system. The patch still allows timeouts, but only when the workers
have actually failed.
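
For readers who have not opened the diff, the pattern described above can be sketched roughly
as follows. This is an illustration only, not the code in the patch: the class name
ProgressingWait, the method waitForever, and the interval parameter are hypothetical, and a
plain Runnable stands in for the Hadoop context.progress() call so that the example compiles
without Hadoop on the classpath.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical stand-in for a predicate-style lock: a waiter blocks until
 * signal() is called, but wakes up periodically to report progress.
 */
final class ProgressingWait {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition signaled = lock.newCondition();
    private boolean eventOccurred = false;

    /** Marks the event as having occurred and wakes all waiters. */
    void signal() {
        lock.lock();
        try {
            eventOccurred = true;
            signaled.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Waits until signal() has been called. Instead of one unbounded wait,
     * it waits in slices of intervalMsecs and runs the progress callback
     * after each slice (assumed here to play the role of context.progress()).
     *
     * @return the number of times the progress callback was invoked
     */
    int waitForever(Runnable progress, long intervalMsecs)
            throws InterruptedException {
        int calls = 0;
        lock.lock();
        try {
            while (!eventOccurred) {
                // Returns on signal, on timeout, or on a spurious wakeup;
                // the loop condition re-checks the predicate in every case.
                signaled.await(intervalMsecs, TimeUnit.MILLISECONDS);
                progress.run();
                calls++;
            }
        } finally {
            lock.unlock();
        }
        return calls;
    }
}
```

A worker that has really died never runs the loop, so it stops reporting progress and can
still be timed out by Hadoop; only live, idle workers keep heartbeating. The interval would
need to be well below the cluster's task timeout.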


Diffs
-----


Diff: https://reviews.apache.org/r/6026/diff/


Testing
-------

Tested July 14, 15, and 16 on a cluster with a variety of data loads and memory/worker constraints.
Passes 'mvn verify' etc.


Thanks,

Eli Reisman

