mesos-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Hindman (JIRA)" <>
Subject [jira] [Commented] (MESOS-2735) Change the interaction between the slave and the resource estimator from polling to pushing
Date Wed, 20 May 2015 00:43:00 GMT


Benjamin Hindman commented on MESOS-2735:

There are definitely differences in message queue behavior, one of which is significantly
safer than the other. There are two safety concerns that I can think of, one of which [~jieyu]
has addressed here but I'll repeat to be sure I properly understood.

(1) Someone might write a ResourceEstimator that isn't asynchronous, causing the slave to
"block" while the resource estimator estimates.

(2) The ResourceEstimator might cause a denial of service attack on the slave.

I understand the concern with (1) but I'm not too anxious about it. Why? It should be trivial
to make a wrapper module which forces people to implement the ResourceEstimator to be asynchronous,
either using `async` like you suggested or implementing a version of ResourceEstimator which
wraps an actor (libprocess process). We'll only need to do this once and then other ResourceEstimator
implementations can leverage this stuff.

On the other hand, I don't like the behavior of push because of (2). Fundamentally, if the
slave can't keep up with the rate at which the ResourceEstimator is pushing then we could
create a denial of service issue with the slave, i.e., it takes a long time to process non-ResourceEstimator
messages because it's queue is full of just ResourceEstimator messages. I'm more anxious about
(2) than (1) because it's harder to find bugs in (2) than with (1) since once you fix (1)
it stays fixed forever but any time you updated the algorithm you impact the potential to
cause (2).

Now, I acknowledge that implementing this as a pull versus push will make the implementation
in the ResourceEstimator slightly more complicated, but not really. In particular, it should
be trivial to always use a `Queue` to achieve the push semantics in any ResourceEstimator
implementation, while still providing the pull semantics externally. Make sense?

Finally, one of the advantages of the pull model is that it's easier to reason about because
we don't have "anonymous" lambdas that cause execution in some other random place in the code
(i.e., you can easily see in the slave where the future that gets returned from `ResourceEstimator::estimate()`
gets handled). In addition, the ResourceEstimator remains "functional" in the sense that it
just has to return some value (or a future) from it's functions versus invoking some callback
that causes something to get run some other place (and in fact, may also block, so isn't it
safer for the ResourceEstimator to invoke the callback in it's own `async`?).

The invocation of the `ResourceEstimator::estimate()` followed by the `.then` is a nice pattern
that let's us compose with other things as well, which is harder to do with the lambda style
callbacks and why we've avoided it where we've been able (in fact, I'm curious which place
in the code are you imitating here?).

> Change the interaction between the slave and the resource estimator from polling to pushing

> --------------------------------------------------------------------------------------------
>                 Key: MESOS-2735
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Jie Yu
>            Assignee: Jie Yu
>              Labels: twitter
> This will make the semantics more clear. The resource estimator can control the speed
of sending resources estimation to the slave.
> To avoid cyclic dependency, slave will register a callback with the resource estimator
and the resource estimator will simply invoke that callback when there's a new estimation
ready. The callback will be a defer to the slave's main event queue.

This message was sent by Atlassian JIRA

View raw message