mesos-dev mailing list archives

From Marco Massenzio <m.massen...@gmail.com>
Subject Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.
Date Fri, 08 Jan 2016 20:37:56 GMT
+1
(my two cents: the “correct” approach from an operations viewpoint is to first query
for the leader, then ask the leader. The shortcoming Ben identified is obvious, but it is possibly the
lesser of the two evils, and probably unavoidable in a distributed system without atomic
transactions, which I don’t think anyone on this list would advocate for?)
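
To make the "query for the leader, then ask the leader" pattern concrete, here is a rough sketch of a client that tolerates the stale-leader race by retrying. Everything in it is hypothetical: `find_leader`, `fetch_state` and `NotLeaderError` are placeholder names for illustration, not Mesos APIs, and it assumes a non-leading master can signal that it cannot answer (problem (1) in Ben's message).

```python
from typing import Any, Callable, Optional


class NotLeaderError(Exception):
    """Hypothetical signal: the contacted master is not (or is no longer) the leader."""


def query_leader_state(
    find_leader: Callable[[], Optional[str]],
    fetch_state: Callable[[str], Any],
    max_attempts: int = 3,
) -> Any:
    """Ask who the leader is, then ask that master for state.

    The leader can change between the two calls, so a NotLeaderError
    is treated as "leader information went stale" and the lookup is
    repeated, up to max_attempts times.
    """
    for _ in range(max_attempts):
        leader = find_leader()
        if leader is None:
            # No leader known right now (e.g. election in progress); try again.
            continue
        try:
            return fetch_state(leader)
        except NotLeaderError:
            # Leadership moved after the lookup; look it up again.
            continue
    raise RuntimeError(f"no leader answered after {max_attempts} attempts")
```

The retry bounds the race rather than removing it: the state returned can still be out of date by the time the caller uses it, which is the unavoidable part mentioned above.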

Thanks to the Benjamin(s) for (finally) giving a name to something I have encountered often
:)
(I used to informally call it “the A-B problems” - your naming is definitely more compelling!)

> On Jan 8, 2016, at 12:29 PM, Benjamin Mahler <bmahler@apache.org> wrote:
> 
> Some feedback on this ticket: it focuses on the solution rather than the
> problem. We generally want to avoid this, I guess it's been coined 'The XY
> Problem' (thanks Benjamin Bannier). In this case it turns out that there
> are actually 2 distinct problems that the user is facing:
> 
> (1) Passive masters return information in some endpoints that can be
> interpreted as incorrect. A passive master does not know the list of tasks,
> for example, and so returning an empty list is less accurate than
> expressing that no response is possible.
> 
> (2) It is difficult to reliably obtain cluster state through the existing
> endpoints. This one is less clear to me than the first problem. Here we
> have to think through how we want users to be hitting state endpoints. Do
> they hit all the masters and take the first valid response? Do they first
> ask for the leader, then query the leader? Both of these have races (the
> first case has an issue that the requests are not atomic, you may receive
> two valid responses; in the second case the leader information may become
> stale before the second request). Do we add redirects? Even redirects have
> issues, there may be multiple redirects, there may be a redirect to a
> master that is unable to redirect further (and so we haven't really solved
> the race difficulties with redirects).
> 
> The point is, it looks like we can easily solve (1), but (2) warrants more
> thought and will be easier to assess with the problem well understood.
> 
> On Wed, Jan 6, 2016 at 12:52 PM, Diogo Gomes <diogomes@gmail.com> wrote:
> 
>> Hi, Adam and Haosdent
>> 
>> 
>> Resurrecting this issue, https://issues.apache.org/jira/browse/MESOS-1865,
>> I would like to +1 this change, which apparently went cold, but
>> I think it is very relevant, and we have had enough time to prepare for a change
>> like this, right?
>> 
>> 
>> If necessary, can I help with something?
>> 
>> 
>> Diogo Gomes
>> 
>> 
>> 
>> 
>> 

