aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Maxim Khutornenko" <ma...@apache.org>
Subject Re: Review Request 30225: Modifying update controller to support heartbeats.
Date Wed, 11 Feb 2015 19:17:50 GMT


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
lines 282-292
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line282>
> >
> >     There's a race on access to `pulseStates`.  Terminating an update between :282
and :287 would cause a stale entry in the map (`storage.read` doesn't offer a lock, but we
shouldn't count on that anyway).  Consider `synchronized (pulseStates)`.
> >     
> >     IMHO, though, this points out that this class is difficult to reason about now
that multiple disparate locks are in play.  Did you consider extracting an inner class to
use?  This would be different from the fully-isolated class we discussed before, where you
can share implementation details and accept high coupling, but at least isolate management
of `pulseStates` and manage it with higher-level operations from the rest of the class.  Seems
like you've already done this to a degree by keeping methods managing this map near each other.
> >     
> >     Your call on this.

The read lock is no more but old habits die hard. You're right, it's no longer required after
removing all DB access from this method. Dropped the storage code and consolidated all map
access in a private class.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 277
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line277>
> >
> >     `storeProvider` is not used, do you need the `storage.read` at all?

Dropped.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStore.java, line
166
> > <https://reviews.apache.org/r/30225/diff/6/?file=853464#file853464line166>
> >
> >     s/query/fetch/

That's already taken. Renamed to "job_update_store_fetch_details_list".


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 123
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line123>
> >
> >     Please document this map more thoroughly.  How are entries added here, how are
they removed, etc.
> >     
> >     For example, one useful detail is that we retain an entry for an update that
is paused.

Done.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 124
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line124>
> >
> >     Please include in test coverage some validation of the contents of this map
so we can gain confidence that we do not have a memory leak.

It was already tested by the following assertion: `assertEquals(JobUpdatePulseStatus.FINISHED,
updater.pulse(UPDATE_ID));`. This would never return FINISHED if there was an entry in the
map.

Added to all pulse tests to make sure all scenarios are validated.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 305
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line305>
> >
> >     Coverage report indicates this line is not covered.  Can you try to cover it?

Done.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 488
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line488>
> >
> >     Shouldn't this be in an `else`?  Looks like we've just wiped a pulse record
for terminal updates, but we add another here.

It's updated conditionally if PulseState exists. Added else to make it more clear.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 545
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line545>
> >
> >     This only has one caller, consider inlining.

Yeah, there is only one caller now after refactoring. Merged.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 547
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line547>
> >
> >     Coverage report indicates an uncovered branch here.  Can you try to cover it?

This one is tricky as it's yellow due to `pulse == null`. Given the trivial nature of this
branch and complexity of covering it, I'd like to punt it.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 721
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line721>
> >
> >     At risk of being overly-verbose, how about `initializePulseMonitor`?  A quick
skim of `initializePulse` makes it sound like it might be simulating an inital pulse.

How about `initializePulseState`? This seems to fit well with the new refactored implementation.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java,
line 776
> > <https://reviews.apache.org/r/30225/diff/6/?file=853467#file853467line776>
> >
> >     Your call, but you could massage out the `clock` if you store `expirationMs`
and have `getExpirationMs()` instead of storing `lastPulseMs`.

I'd rather keep the last pulse instead as it will be better for debugging endpoint coming
in AURORA-1103.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachine.java, line
165
> > <https://reviews.apache.org/r/30225/diff/6/?file=853468#file853468line165>
> >
> >     This is never used as a `Function`, strongly consider converting to a method.
 Ditto elsewhere if it applies.

Converted to a method.


> On Feb. 11, 2015, 1:03 a.m., Bill Farner wrote:
> > src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java,
lines 700-726
> > <https://reviews.apache.org/r/30225/diff/6/?file=853470#file853470line700>
> >
> >     I think you've done the right thing, but note that this unit test is inconsistent
with your reply on review https://reviews.apache.org/r/30804/

I still stand by that comment as it's applicable to thrift tests that don't require extensive
setup. It's quite the opposite here where setup is complicated and every test takes a perf
toll to run.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30225/#review71879
-----------------------------------------------------------


On Feb. 7, 2015, 2:43 a.m., Maxim Khutornenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30225/
> -----------------------------------------------------------
> 
> (Updated Feb. 7, 2015, 2:43 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Bill Farner.
> 
> 
> Bugs: AURORA-1010
>     https://issues.apache.org/jira/browse/AURORA-1010
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Added pulsing support into the JobUpdateController. The qualified coordinated updates
get blocked until a pulse arrives. An update then becomes active and proceeds until `blockIfNoPulsesAfterMs`
expires or the update reaches a terminal state (whichever comes first).
> 
> UPDATE:
> - Added explicit states to capture "blocked" updates
> - Refactored pulseUpdate() to not rely on DB state
> - Dropped JobUpdatePulseState.PAUSED as it does not seem necessary.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 4f603f9e7ed004e53937a45ea8edf7241e15f5cf

>   src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java 7f2ec71d744d5fafac84291cc79feba3398d0e1e

>   src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java b7d8d524e15f101416889c00efc3ecf2c8d9c1a1

>   src/main/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStore.java d479d20259f284528b2291e2e486b36d8e47fe5e

>   src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java 60f535988a20638fb16515d5027bfa65f1afb73c

>   src/main/java/org/apache/aurora/scheduler/updater/JobUpdateController.java d3b30d48b76d8d7c64cda006a34f7ed3296526f2

>   src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java ec9d1e07abca1bf30262e1c0f741a34741e96f5d

>   src/main/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachine.java 76460f95e058181b711fb6869f5a34c1d5bdfe31

>   src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
f9c9ceddc559b43b4a5c45c745d54ff47484edde 
>   src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java ca7c0c2675477cc727ca006697665f997972dfde

>   src/test/java/org/apache/aurora/scheduler/testing/FakeScheduledExecutor.java 92cfa2b30c1c4daa3ae225fc8609fbeaecdaff7a

>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachineTest.java 89765ac3d34a827d3748fb96a275d78e9d1b8b72

>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 41d422939209d0808235128e4242c11e8ef25969

> 
> Diff: https://reviews.apache.org/r/30225/diff/
> 
> 
> Testing
> -------
> 
> ./gradlew -Pq build
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message