aurora-reviews mailing list archives

From Zameer Manji <zma...@apache.org>
Subject Re: Review Request 56723: Add best effort pulse timestamp recovery.
Date Thu, 16 Feb 2017 02:24:47 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56723/
-----------------------------------------------------------

(Updated Feb. 15, 2017, 6:24 p.m.)


Review request for Aurora, David McLaughlin and Santhosh Kumar Shanmugham.


Bugs: AURORA-1890
    https://issues.apache.org/jira/browse/AURORA-1890


Repository: aurora


Description
-------

Currently the scheduler moves all coordinated ("pulsed") updates into
ROLL_FORWARD_AWAITING_PULSE or ROLL_BACK_AWAITING_PULSE on scheduler
startup/recovery. This is because the last pulse timestamp is not durably stored,
so on recovery it is reset to 0L (i.e. no pulse yet).

In cases where the pulse timeout is large and failovers are fast or frequent,
this causes many updates to unnecessarily transition into a pulse-related state
until the next pulse.

It is possible to avoid these unnecessary transitions by traversing the job update
events and initializing the last pulse timestamp to the timestamp of the last event
if the last event was not a pulse event.
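
A rough sketch of the recovery idea, not the code in this diff. All class and
method names below are hypothetical, the status enum is a simplified subset, and
two assumptions are made: events are ordered oldest to newest, and "pulse event"
means an event that put the update into one of the *_AWAITING_PULSE states.

```java
import java.util.List;

// Hypothetical stand-in types; these are not the Aurora scheduler's classes.
final class PulseRecoverySketch {
  // Sentinel meaning "no pulse yet", matching the 0L in the description.
  static final long NO_PULSE = 0L;

  // Simplified subset of update statuses, for illustration only.
  enum Status {
    ROLLING_FORWARD,
    ROLLING_BACK,
    ROLL_FORWARD_AWAITING_PULSE,
    ROLL_BACK_AWAITING_PULSE
  }

  // One entry in an update's event history.
  static final class Event {
    final Status status;
    final long timestampMs;

    Event(Status status, long timestampMs) {
      this.status = status;
      this.timestampMs = timestampMs;
    }
  }

  // Assumption: a "pulse event" is one that blocked the update on a pulse.
  static boolean isAwaitingPulse(Status status) {
    return status == Status.ROLL_FORWARD_AWAITING_PULSE
        || status == Status.ROLL_BACK_AWAITING_PULSE;
  }

  // Best-effort recovery: if the update was not blocked on a pulse when the
  // scheduler went down, treat the last event's timestamp as the last pulse.
  // Otherwise keep the "no pulse yet" sentinel.
  static long recoverLastPulseMs(List<Event> events) {
    if (events.isEmpty()) {
      return NO_PULSE;
    }
    Event last = events.get(events.size() - 1);
    return isAwaitingPulse(last.status) ? NO_PULSE : last.timestampMs;
  }
}
```

Under these assumptions, an update whose last event is ROLLING_FORWARD resumes
with that event's timestamp as its last pulse, so it is only blocked if the pulse
timeout has actually elapsed since then. An update whose last event is
ROLL_FORWARD_AWAITING_PULSE stays blocked until the next pulse, as before.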


Diffs (updated)
-----

  api/src/main/thrift/org/apache/aurora/gen/api.thrift efd4e534c4ad90862d7a9fae437ed724da3a34dc

  src/main/java/org/apache/aurora/scheduler/base/Jobs.java 49e5b2cfc0b84bb0e0c95cca375cd0503f9dcdb5

  src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java 729c1234a2e27f1e756ddfd6a4e5a04fa20bbd7f

  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java ea0b89a232c2fc10f2183218b750bb0478d51a58


Diff: https://reviews.apache.org/r/56723/diff/


Testing
-------


Thanks,

Zameer Manji

