aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zameer Manji <zma...@apache.org>
Subject Re: Review Request 57061: Enable Mesos HTTP API.
Date Tue, 28 Feb 2017 00:43:19 GMT


> On Feb. 27, 2017, 3:22 p.m., Stephan Erb wrote:
> > In general this looks good to me. I am not an expert on the HTTP API though, so
I cannot really say if we are using it correctly in all corner cases. I trust your judgement
here :-)
> > 
> > However, I have to admit I am not super thrilled about the merged SchedulerDriver.
The new HTTP API is the one to stay in the future, so by keeping it separate we would have
(potentially) achieved a cleaner code base and would have made it simpler to drop the old
API in the future. While I achknowledge that this approach has reduced duplication, I don't
believe it outweights the added complexity of the merged implementation.
> > 
> > I am happy to here what you think about this.

Do you mean merged `MesosSchedulerImpl`? I agree merging it makes the code complex, but my
first two cuts on seperate implementations were not good and made me feel uncomfortable. There
is a lot of subtle logic in the implementation, including event firing and metrics processing.

I feel that it's pretty easy to introduce a regression, and I did introduce several in local
prototyping, mostly involving metrics.

I would feel ok splitting apart the code if there was a good way to maximize code reuse. If
you have a pattern/design in mind please hare.


> On Feb. 27, 2017, 3:22 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImpl.java, line 380
> > <https://reviews.apache.org/r/57061/diff/4/?file=1650744#file1650744line380>
> >
> >     Would be great if we could log more info about the master we have connected
to.

This is not clear but we do not get the "MasterInfo" object until we do Subscribe call and
get a Subscribed event back. See `handleRegistration` for more information. There we log the
MasterInfo object and all other data.

I believe with GLOG_v=1 the native code logs more information when connecting but alas we
do not have access to that data here.


> On Feb. 27, 2017, 3:22 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImpl.java, lines 470-471
> > <https://reviews.apache.org/r/57061/diff/4/?file=1650744#file1650744line470>
> >
> >     Should we add some debug logging here? Or is this too frequent even for debug?
> 
> Stephan Erb wrote:
>     And/or we could add a counter to track the health of the scheduler connection via
/vars.

The default frequency of heartbeats seems to be every 15 seconds so it is way too frequent
to log.

However, I will add metrics for each event recieved.


> On Feb. 27, 2017, 3:22 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImpl.java, line 474
> > <https://reviews.apache.org/r/57061/diff/4/?file=1650744#file1650744line474>
> >
> >     Nitpick: You are somewhat inconsistent with how you format log messages.

Fixed. Thanks for the catch. I prefer to use SLF4J formatting.


> On Feb. 27, 2017, 3:22 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/mesos/VersionedSchedulerDriverService.java,
lines 91-94
> > <https://reviews.apache.org/r/57061/diff/4/?file=1650748#file1650748line91>
> >
> >     The log messages are duplicated (see V1 scheduler methods). Can we drop them
in one place?

You are right that these are duplicated. I'm not sure how to fix this.

So if you look carefully at the `VersionedDriverFactory` implementations in `LibMesosLoadingModule`
you will see that the `V0Mesos` takes in the framework info like the scheduler driver. This
means that if you are using the V0 driver you are registering on startup of the service. However,
`V1Mesos` doesn't and we are doing registration with the SUBSCRIBE call after we connect.

So technically we are doing both things in different places depending on the implementation
so I feel that the logging should be here, but I'm unsure on how to deduplicate it cleanly.


- Zameer


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57061/#review166958
-----------------------------------------------------------


On Feb. 27, 2017, 1:44 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57061/
> -----------------------------------------------------------
> 
> (Updated Feb. 27, 2017, 1:44 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Mehrdad Nurolahzade, and Stephan Erb.
> 
> 
> Bugs: AURORA-1887 and AURORA-1888
>     https://issues.apache.org/jira/browse/AURORA-1887
>     https://issues.apache.org/jira/browse/AURORA-1888
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This patch completes the design doc[1] and enables operators to choose between
> two V1 Mesos API implementations. The first is `V0Mesos` which offers the V1 API
> backed by the scheduler driver and the second is `V1Mesos` which offers the V1
> API backed by a new HTTP API implementation.
> 
> There are three sets of changes in this patch.
> 
> First, the V1 Mesos code requires a Scheduler callback with a different API. To
> maximize code reuse, `MesosSchedulerImpl` was extended to implement the new
> callback as well as the old callback. The code and tests were reshufled to
> maxmize logic reuse.
> 
> Second, a new driver implementation using the new API was created. All of the
> logic for the new driver is encapsulated in the
> `VersionedSchedulerDriverService` class.
> 
> Third, some wiring changes were done to allow for Guice to do it's work and
> allow for operators to select between the different driver implementations.
> 
> [1] https://docs.google.com/document/d/1bWK8ldaQSsRXvdKwTh8tyR_0qMxAlnMW70eOKoU3myo
> 
> 
> Diffs
> -----
> 
>   examples/vagrant/upstart/aurora-scheduler.conf 49fdcbd8b7406a59ae7882473b9eddbfce3ece7c

>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java e2ef9c30720698263106f22e3e24db5d0468b155

>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 805e9de9bc45396cb8fc6e33ddb3d7428312c608

>   src/main/java/org/apache/aurora/scheduler/mesos/LibMesosLoadingModule.java e1a23590c795a489e753b77b0835d30d3be174b5

>   src/main/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImpl.java eb210962c54cd0d33e3760b32f5b0ca1a7079204

>   src/main/java/org/apache/aurora/scheduler/mesos/ProtosConversion.java bc9e23b7410c00b7d5ffa4f23db93a51e9d0f405

>   src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverModule.java 5519323079b2c957a23e093dcc77929148b4a59a

>   src/main/java/org/apache/aurora/scheduler/mesos/VersionedDriverFactory.java PRE-CREATION

>   src/main/java/org/apache/aurora/scheduler/mesos/VersionedSchedulerDriverService.java
PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 05518048ca5518a007281269aa402a7d0710eb62

>   src/test/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImplTest.java c599fe30bc903b3a3fb178df70a46d2421b6c45e

>   src/test/java/org/apache/aurora/scheduler/mesos/VersionedSchedulerDriverServiceTest.java
PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/thrift/ThriftIT.java f2275c757ebfa52179e31b95bf0c02b6753fb7e3

> 
> Diff: https://reviews.apache.org/r/57061/diff/
> 
> 
> Testing
> -------
> 
> The e2e test has been run three times, each time with a different driver option.
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message