flink-dev mailing list archives

From "Ufuk Celebi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-3003) Add container allocation timeout to YARN CLI
Date Wed, 11 Nov 2015 10:06:10 GMT
Ufuk Celebi created FLINK-3003:
----------------------------------

             Summary: Add container allocation timeout to YARN CLI
                 Key: FLINK-3003
                 URL: https://issues.apache.org/jira/browse/FLINK-3003
             Project: Flink
          Issue Type: Improvement
          Components: YARN Client
    Affects Versions: 0.10
            Reporter: Ufuk Celebi
             Fix For: 1.0, 0.10.1


Programs submitted via {{bin/flink run -m yarn-cluster}} start a short-lived YARN session
before submitting the job. The job is only submitted once all resources have been allocated.
Until then, the containers already allocated are "blocked" by the to-be-submitted job, even
though the cluster is only partially allocated.

If you have multiple submissions like this with partial allocations, you can block the whole
YARN cluster (e.g. 10 containers in total, two sessions that want 6 containers each, and both
have allocated 5).
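
The numbers in this example can be checked with a small simulation. This is only a sketch: {{allocate_round_robin}} is a hypothetical helper, and handing out containers one at a time in turn is an assumption made for illustration, not a description of how the YARN scheduler actually orders allocations.

```python
def allocate_round_robin(total_containers, requests):
    """Hand out containers one at a time, cycling over sessions
    that still hold fewer containers than they requested."""
    held = [0] * len(requests)
    free = total_containers
    while free > 0:
        progressed = False
        for i, wanted in enumerate(requests):
            if free > 0 and held[i] < wanted:
                held[i] += 1
                free -= 1
                progressed = True
        if not progressed:
            break  # every session is satisfied
    return held, free


requests = [6, 6]  # two sessions, 6 containers each
held, free = allocate_round_robin(10, requests)

# Blocked: nothing is free, yet no session has everything it asked for.
blocked = free == 0 and all(h < w for h, w in zip(held, requests))
print(held, free, blocked)  # [5, 5] 0 True
```

Neither session can ever reach 6 unless the other gives containers back, which is the situation the timeout is meant to break.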

A simple workaround for these situations is to add an allocation timeout after which the
YARN session fails and releases all its resources.
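
A minimal sketch of the timeout idea follows. It does not use the Flink or YARN client APIs: {{try_allocate}}, {{release}}, {{AllocationTimeout}} and the {{clock}} parameter are all hypothetical names introduced for illustration, and polling is just one possible way to wait.

```python
import time


class AllocationTimeout(Exception):
    """Raised when not all containers arrive before the deadline."""


def allocate_with_timeout(try_allocate, release, wanted, timeout_s,
                          clock=time.monotonic, poll_s=0.0):
    """Collect `wanted` containers or give everything back.

    try_allocate() returns a container, or None if none is available
    right now; release(container) hands a container back.
    """
    held = []
    deadline = clock() + timeout_s
    while len(held) < wanted:
        container = try_allocate()
        if container is not None:
            held.append(container)
            continue
        if clock() >= deadline:
            # Fail the session and release the partial allocation.
            for c in held:
                release(c)
            raise AllocationTimeout(
                f"got {len(held)} of {wanted} containers in {timeout_s}s")
        time.sleep(poll_s)
    return held


# Usage with a fake pool of 5 containers and a session that wants 6.
pool = list(range(5))
released = []
try:
    allocate_with_timeout(lambda: pool.pop() if pool else None,
                          released.append, wanted=6,
                          timeout_s=0.05, poll_s=0.01)
except AllocationTimeout:
    print(len(released))  # 5
```

Once the failed session has released its containers, another waiting session can complete its allocation.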

[Other strategies are also possible, e.g. wait X amount of time for Y containers, but then go
with what you have if you don't get all of them.]
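
The "go with what you have" variant could look roughly like this. Again this is a sketch under assumptions: {{allocate_best_effort}} and {{try_allocate}} are hypothetical names, not existing Flink or YARN functions, and a real implementation would probably also need a minimum below which the session still fails.

```python
import time


def allocate_best_effort(try_allocate, wanted, wait_s,
                         clock=time.monotonic, poll_s=0.0):
    """Wait up to wait_s seconds for `wanted` containers, then
    return whatever has been allocated so far (possibly fewer)."""
    held = []
    deadline = clock() + wait_s
    while len(held) < wanted and clock() < deadline:
        container = try_allocate()  # None if nothing is available now
        if container is not None:
            held.append(container)
        else:
            time.sleep(poll_s)
    return held


# Usage: 5 containers available, 6 wanted; run with the 5 we got.
pool = list(range(5))
got = allocate_best_effort(lambda: pool.pop() if pool else None,
                           wanted=6, wait_s=0.05, poll_s=0.01)
print(len(got))  # 5
```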



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
