flink-issues mailing list archives

From "Eron Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7615) Under mesos when using a role, TaskManagers fail to schedule
Date Wed, 13 Sep 2017 23:37:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165474#comment-16165474 ]

Eron Wright  commented on FLINK-7615:

[~addisonj@gmail.com] thanks for the report.   I think this is a duplicate of FLINK-7294 which
is close to being fixed.  Feel free to review the PR submitted under that ticket.

> Under mesos when using a role, TaskManagers fail to schedule
> ------------------------------------------------------------
>                 Key: FLINK-7615
>                 URL: https://issues.apache.org/jira/browse/FLINK-7615
>             Project: Flink
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 1.3.2
>            Reporter: Addison Higham
> When `mesos.resourcemanager.framework.role` is specified, TaskManagers are unable to
start. An error message is given that indicates that the requested resources cannot be satisfied.
I sadly lost the logs, but essentially it appears that an offer extended by Mesos is accepted,
but the request is made for resources under the default role (`*`), while the resources
offered all exist under the specified role.
> I believe this is likely to do with the fact that while the framework properly starts
under the specified role (meaning it only gets offers of the specified role), it isn't making
`Protos.Resource` objects with a role defined.
> This can be seen here: https://github.com/apache/flink/blob/release-1.3.2/flink-mesos/src/main/java/org/apache/flink/mesos/Utils.java#L72
> The Mesos docs for `Resource.Builder.setRole` (http://mesos.apache.org/api/latest/java/org/apache/mesos/Protos.Resource.Builder.html#setRole-java.lang.String-)
allow for a role to be provided. (Note: this method is shown as deprecated as of Mesos 1.4.0,
but for the version Flink currently uses, 1.0.1, it is the only mechanism.)
> I believe this should mostly be fixed by something like this:
> {code:java}
> 	/**
> 	 * Construct a scalar resource value.
> 	 */
> 	public static Protos.Resource scalar(String name, double value, Option<String> role) {
> 		Protos.Resource.Builder builder = Protos.Resource.newBuilder()
> 			.setName(name)
> 			.setType(Protos.Value.Type.SCALAR)
> 			.setScalar(Protos.Value.Scalar.newBuilder().setValue(value));
> 		if (role.isDefined()) {
> 			builder.setRole(role.get());
> 		}
> 		return builder.build();
> 	}
> {code}
> However, perhaps we want to consider upgrading to mesos 1.4.x that has the newer API
for this (http://mesos.apache.org/api/latest/java/org/apache/mesos/Protos.Resource.ReservationInfo.Builder.html#setRole-java.lang.String-)

> In looking at the other options for ReservationInfo, I don't see any current need to
expose any of those parameters for configuration, but perhaps some FLIP-6 work could benefit.
> [~till.rohrmann] any thoughts? I can implement a fix as above against mesos 1.0.1, but
figured I would get your input before submitting a patch for this
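
The role mismatch described in the quoted report can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `Resource`, `scalar`, and `satisfies` below are hypothetical stand-ins, not the Mesos `Protos` classes or Flink's `Utils`, and the matching rule (a request is only satisfied by offered resources carrying the same role) is a simplification of the behavior the reporter suspects. `java.util.Optional` is used here in place of the `Option` type in the proposed patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Simplified model of the suspected bug; none of these types are real Mesos or Flink classes.
final class RoleMismatchSketch {
    static final String DEFAULT_ROLE = "*";

    /** Hypothetical stand-in for a scalar resource: a name, an amount, and a role. */
    static final class Resource {
        final String name;
        final double value;
        final String role;

        Resource(String name, double value, String role) {
            this.name = name;
            this.value = value;
            this.role = role;
        }
    }

    /** Builds a scalar resource; falls back to the default role when none is given. */
    static Resource scalar(String name, double value, Optional<String> role) {
        return new Resource(name, value, role.orElse(DEFAULT_ROLE));
    }

    /** Assumed rule: only offered resources with the same name and role count toward a request. */
    static boolean satisfies(List<Resource> offered, Resource requested) {
        double available = 0.0;
        for (Resource r : offered) {
            if (r.name.equals(requested.name) && r.role.equals(requested.role)) {
                available += r.value;
            }
        }
        return available >= requested.value;
    }

    public static void main(String[] args) {
        // The framework runs under role "flink", so every offered resource carries that role.
        List<Resource> offer = Arrays.asList(
            scalar("cpus", 4.0, Optional.of("flink")),
            scalar("mem", 8192.0, Optional.of("flink")));

        // Request built without a role: it ends up under "*", so nothing in the offer matches.
        System.out.println(satisfies(offer, scalar("cpus", 1.0, Optional.empty())));   // prints false

        // Request built with the framework's role: the offer can satisfy it.
        System.out.println(satisfies(offer, scalar("cpus", 1.0, Optional.of("flink")))); // prints true
    }
}
```

In this model the fix amounts to threading the configured role through to the resource builder, which is what the proposed `scalar(name, value, role)` overload does.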

This message was sent by Atlassian JIRA
