mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joseph Wu <jos...@mesosphere.io>
Subject Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.
Date Tue, 14 Feb 2017 19:47:12 GMT


> On Feb. 13, 2017, 11:51 a.m., Joseph Wu wrote:
> > src/master/master.cpp, lines 4756-4777
> > <https://reviews.apache.org/r/56587/diff/1/?file=1631678#file1631678line4756>
> >
> >     Moving the error above the CHECK leaves out the log message in one case:
> >     
> >     * ACCEPT_INVERSE_OFFER is called with multiple offers (invalid-offer-id and
valid-offer-id).
> >     * In the processing loop, we will LOG(WARNING) for `invalid-offer-id`.
> >     * After the processing loop, we _should_ LOG(INFO) for `valid-offer-id` with
the agent and framework info too.  <-- This one is left out in your patch
> 
> Anand Mazumdar wrote:
>     hmm, the semantics that we have currently for the `accept()`/`acceptInverseOffers()`
is that even if we have a _single_ invalid offer we treat the entire request as being invalid.
This seems like the right behavior to me i.e., we log that the call used invalid offers if
any of the offers are invalid. If we feel otherwise though, that seems like a separate orthogonal
issue that we should tackle for both the handlers.

It feels right to reject the entire array of offers if any of them are invalid.

The problem with inverse offers is that there is no feedback to the framework.  If a framework
accidentally formats all InverseOfferIDs incorrectly, we have no way of indicating this to
the framework.  For normal offers, the master can send back a status update with TASK_DROPPED/TASK_LOST.
^ This is why we try to process all the valid InverseOfferIDs even if part of the request
is invalid.


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/#review165383
-----------------------------------------------------------


On Feb. 12, 2017, 9:44 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56587/
> -----------------------------------------------------------
> 
> (Updated Feb. 12, 2017, 9:44 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-7119
>     https://issues.apache.org/jira/browse/MESOS-7119
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The erroneous invariant check for `slaveId` can be trigerred when
> the master accepts an invalid inverse offer or when the inverse offer
> has been already rescinded.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
>   src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 
> 
> Diff: https://reviews.apache.org/r/56587/diff/
> 
> 
> Testing
> -------
> 
> make check + added a test that used to crash before.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message