mesos-user mailing list archives

From Vinod Kone <vinodk...@gmail.com>
Subject Re: resources not offered to framework
Date Tue, 18 Aug 2015 14:15:49 GMT
You can start the master (and marathon) with GLOG_v=1 in the environment to enable more verbose
logging. 

HTH,

@vinodkone
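
One way to apply the suggestion above from a launcher script is sketched here. It is a minimal illustration rather than anything from the thread: the helper names `verbose_glog_env` and `launch_with_verbose_logging` are hypothetical, and the command you pass in is whatever you already use to start the master or Marathon. The only detail taken from the message is the `GLOG_v=1` environment variable.

```python
import os
import subprocess


def verbose_glog_env(level=1, base_env=None):
    """Return a copy of the environment with GLOG_v set to `level`.

    The input mapping is copied, so the caller's environment is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GLOG_v"] = str(level)  # glog reads its verbosity from this variable
    return env


def launch_with_verbose_logging(command, level=1):
    """Start `command` (a list of arguments) with verbose glog logging enabled.

    `command` is an assumption of this sketch: pass the same command line
    you normally use to start the process.
    """
    return subprocess.Popen(command, env=verbose_glog_env(level))
```

Building the environment separately from launching keeps the first helper easy to check in isolation, and the same dictionary can be reused for both the master and Marathon.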

> On Aug 18, 2015, at 5:10 AM, Mike Barborak <MikeB@MindLakes.com> wrote:
> 
> Thanks. In testing my new setup using roles, I’m having a problem with Marathon not
being offered any resources. I’ve posted a question on the Marathon forums:
>  
> https://groups.google.com/forum/?hl=en#!topic/marathon-framework/bQ2pO5Dk2MA
>  
> but am not getting any replies so was wondering if there is guidance for understanding
from a Mesos perspective why a framework (Marathon here) is not getting resource offers. (Btw,
my own custom framework does get offers – just not Marathon.) Is there a set of command
line options that will reveal the master’s resource offer process to the point that the
problem will be revealed? Or is there a troubleshooting guide that explains why a framework
might not be getting resource offers?
>  
> Sorry to be so non-specific – I’m a few days into this and starting to grasp.
>  
> Thanks,
> Mike
>  
> From: Vinod Kone [mailto:vinodkone@gmail.com] 
> Sent: Friday, August 14, 2015 11:37 AM
> To: user@mesos.apache.org
> Subject: Re: resources not offered to framework
>  
> If the currently running tasks do not have checkpointing turned on, they cannot reconnect
to a restarted slave no matter what. 
>  
> And yes, currently you can't change a slave's resource roles without wiping its metadata.

> 
> @vinodkone
> 
> On Aug 14, 2015, at 6:14 AM, Mike Barborak <MikeB@MindLakes.com> wrote:
> 
> I’ve made the changes to my frameworks and Marathon to use roles. My question is, is
there a way to change a slave’s role without restarting it? I ask because the slave I want
to reconfigure is running frameworks that scheduled tasks that take a very long time to complete
their work. These frameworks do not have checkpointing turned on. (I’ve changed the code
so that they will in the future.) My understanding and experience tell me that to change the
slave’s configuration, I have to restart the slave and that when I do that I will get a
log message saying I have to rm -f /tmp/mesos/meta/slaves/latest. After I do that and restart,
I believe the running frameworks will not reconnect with the slave (does that sound right?)
and will time out and shut down along with the tasks they scheduled, and that is what I’m
trying to avoid.
>  
> Thanks,
> Mike
>  
> From: Mike B 
> Sent: Tuesday, July 14, 2015 5:33 PM
> To: user@mesos.apache.org
> Subject: RE: resources not offered to framework
>  
> I hadn’t understood the difference between roles and attributes. That sounds like what
I am looking for. Thanks for your help.
>  
> -Mike
>  
> From: Vinod Kone [mailto:vinodkone@gmail.com] 
> Sent: Tuesday, July 14, 2015 4:37 PM
> To: user@mesos.apache.org
> Subject: Re: resources not offered to framework
>  
>  
> On Tue, Jul 14, 2015 at 4:36 AM, Mike B <MikeB@mindlakes.com> wrote:
> I could see the master processing ACCEPT calls for offers and I could see the resources
associated with the new slave being recovered because none of the frameworks they were offered
to wanted them. What I never saw was these new resources being offered to the framework that
could have used them. Ideally, I would have liked these new resources to have been offered
to that framework. (One note, another instance of the same framework was launched after seeing
this problem and it was offered these new resources.)
>  
> 
> I imagine this could be possible with the built-in allocator if the framework (say F)
that needed the "worker" resources had a high DRF share and other frameworks had a low DRF
share. If the frameworks that do not need "worker" resources do not filter them for long enough
(that is, their refuse_seconds is small), they might repeatedly become candidates for allocation,
starving out F.
>  
> Couple of options here.
> --> You can have frameworks that are not interested in "worker" resources decline
offers (with "worker" resources) with a very long interval (say 1 year).
>  
> --> Instead of attributes, use roles (role: worker, role: workstation, etc.) and have
framework F register with role "worker".
>  
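
The starvation scenario and the two options above can be illustrated with a toy model. This is a simplified sketch, not the Mesos allocator: it assumes a single bundle of "worker" resources, one allocation per round, frameworks tried in ascending order of share, and every framework other than the target declining immediately with a filter of `refuse_rounds` rounds. The function `first_offer_round`, its parameters, and the framework names are hypothetical.

```python
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # the "say 1 year" decline interval


def first_offer_round(shares, target, refuse_rounds, eligible=None, max_rounds=1000):
    """Return the first round in which `target` is offered the resources, or None.

    shares: framework name -> share; lower shares are tried first.
    refuse_rounds: how long a declining framework filters the resources.
    eligible: optional set of frameworks allowed to see the resources at all
              (a stand-in for registering under a matching role).
    """
    candidates = [name for name in sorted(shares, key=shares.get)
                  if eligible is None or name in eligible]
    filtered_until = {name: 0 for name in candidates}
    for now in range(max_rounds):
        for name in candidates:
            if filtered_until[name] > now:
                continue  # this framework's decline filter is still active
            if name == target:
                return now  # the target finally receives the offer
            # Any other framework declines and filters the resources for a while.
            filtered_until[name] = now + refuse_rounds
            break  # the resources were offered this round; wait for the next one
    return None


shares = {"A": 0.1, "B": 0.2, "F": 0.8}  # F has the highest share

# Short filters: A and B keep becoming candidates again, so F is starved.
print(first_offer_round(shares, "F", refuse_rounds=1))                 # None
# Very long filters: once A and B have each declined, F gets the offer.
print(first_offer_round(shares, "F", refuse_rounds=ONE_YEAR_SECONDS))  # 2
# Role-like restriction: only F is eligible, so it is offered immediately.
print(first_offer_round(shares, "F", refuse_rounds=1, eligible={"F"}))  # 0
```

In this model F is starved whenever the filter expires before every lower-share framework has had its turn, which is why both a long decline interval and a role restriction resolve it.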
