hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4879) Proposal for a simple (delta) allocate protocol
Date Thu, 31 Mar 2016 19:04:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220475#comment-15220475
] 

Vinod Kumar Vavilapalli commented on YARN-4879:
-----------------------------------------------

Thanks for the doc, [~subru] and [~asuresh]! +1 overall for a unique identifier.

h4. Comments on your doc

 - I'd rather call it "an enhancement to identify requests explicitly" instead of "simple
(delta) allocate protocol". We used to use the phrase "delta protocol" in a slightly different
context - see YARN-110.
 - bq. The RM will attempt to allocate containers in decreasing sequence number order,
Why are we putting priority semantics onto the ID? We should just follow the existing priority
ordering.
 - bq.  In our proposal, we could potentially have requests for each container at worst case.

This costs network and memory overhead as well as scheduler CPU time. Until we move to
global scheduling completely, we should be cautious about this. Of course, by inverting the
ResourceRequest and still keying by ResourceName in the API, we limit the total number of entries
to the order of the cluster size.
I already suggested on YARN-1547 that we also have an upper limit on the total number of requests
- see [here|https://issues.apache.org/jira/browse/YARN-1547?focusedCommentId=15218681&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15218681].
I also strongly suggest that we have additional limits on the total number of IDs that can
be used - this will fit our narrative at YARN-4902 too.
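
To make the two limits concrete, here is a minimal sketch of a per-application check. Everything in it is hypothetical: {{AskLimits}}, its constructor parameters and {{accepts}} are names made up for illustration, not existing YARN classes or configuration keys, and it assumes each request in an allocate call carries exactly one ID.

```java
import java.util.HashSet;
import java.util.List;

/** Hypothetical per-application limit check; not part of YARN. */
final class AskLimits {
    private final int maxRequests;     // assumed cap on total requests per application
    private final int maxDistinctIds;  // assumed cap on distinct IDs per application

    AskLimits(int maxRequests, int maxDistinctIds) {
        this.maxRequests = maxRequests;
        this.maxDistinctIds = maxDistinctIds;
    }

    /**
     * @param requestIds the ID carried by each outstanding request, one list entry per request
     * @return true only if both the request count and the distinct-ID count are within limits
     */
    boolean accepts(List<Long> requestIds) {
        if (requestIds.size() > maxRequests) {
            return false;
        }
        return new HashSet<>(requestIds).size() <= maxDistinctIds;
    }
}
```

The two caps are kept separate because they bound different things: the first bounds the number of entries the scheduler has to walk, the second bounds how many logical groups an application can create.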

h4. Comments from YARN-4902

Copying here, with light edits, a few comments that we posted in the document for YARN-4902 and that
I think were not laid out explicitly in your doc. We were calling it Allocation-ID there, but I
guess I now like Request-ID better. If some or all of them make sense, you can add them to
your doc.
 - *Scope*: This ID is a unique identifier for different ResourceRequests from the *same application*
- essentially, IDs only need to be unique within an application, so the same ID may appear in different applications.
 - *Generation*: The application should simply generate a unique identifier within the application
- or, if the application prefers, the client libraries can generate it.
 - *Non-binding nature*: Applications can continue to completely ignore the returned Allocation-ID
in the response and use the allocation for any of their outstanding requests.
 - *Responses*: The scheduler may return multiple responses corresponding to the same Allocation-ID,
as and when it returns allocations.
 - *Deeper details on updates*: Similar to the current API, an update that carries only selected fields
against a previously existing Allocation-ID will only update those fields of the object (as opposed to replacing
it). For example, say a ResourceRequest first gets created with Allocation-ID "76589" and with
_"host: *"_. A future ResourceRequest with the same Allocation-ID but with contents _"rack05:
10"_ will only append the rack information to the existing object. This is how one can replace
parts of an object, and it is similar to how the existing per-record-deltas based protocol works.
 - *Deletes*: Similarly, if one wishes to replace an entire ResourceRequest corresponding
to a specific allocation-ID, they can simply cancel the corresponding ResourceRequest and
submit a new one afresh.
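
The update and delete semantics above can be sketched as a small table keyed by ID. This is only an illustration under assumptions: {{RequestTable}} and its methods are hypothetical names, not YARN classes, and each object is modelled as a plain map from resource name to container count, which is simpler than a real ResourceRequest.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical table of outstanding asks keyed by Request-ID / Allocation-ID; not a YARN class. */
final class RequestTable {
    // id -> (resource name -> number of containers asked for)
    private final Map<Long, Map<String, Integer>> asks = new HashMap<>();

    /** Update: merge the given entries into the existing object; other entries are left untouched. */
    void update(long id, Map<String, Integer> delta) {
        asks.computeIfAbsent(id, k -> new HashMap<>()).putAll(delta);
    }

    /** Delete: cancel the whole object behind this ID. */
    void cancel(long id) {
        asks.remove(id);
    }

    /** Replace: cancel the old object and submit a new one afresh. */
    void replace(long id, Map<String, Integer> fresh) {
        cancel(id);
        update(id, fresh);
    }

    /** Read-only view of the object behind this ID; empty if the ID is unknown. */
    Map<String, Integer> get(long id) {
        Map<String, Integer> entry = asks.get(id);
        return entry == null
                ? Collections.<String, Integer>emptyMap()
                : Collections.unmodifiableMap(entry);
    }
}
```

With this model, an {{update}} for ID 76589 that carries only a "rack05" entry leaves an earlier "*" entry in place, while {{replace}} drops everything previously stored under that ID.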

h4. Other responses
bq.  If a node local allocation is made for node N1, we can immediately lookup the entries
for rack and ANY by using the ID key and decrement them instead of linearly scanning the rack/ANY
entries.
+1, ID is really the logical grouping key.
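
A rough sketch of that lookup, assuming outstanding asks are grouped by ID first and by resource name second. {{GroupedAsks}} and its methods are hypothetical and only illustrate the idea; the real scheduler data structures differ, and the rack of the node is assumed to be known to the caller.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical grouping of outstanding asks by ID; not the scheduler's actual bookkeeping. */
final class GroupedAsks {
    static final String ANY = "*";

    // id -> (resource name -> outstanding container count)
    private final Map<Long, Map<String, Integer>> byId = new HashMap<>();

    void ask(long id, String resourceName, int containers) {
        byId.computeIfAbsent(id, k -> new HashMap<>()).put(resourceName, containers);
    }

    /**
     * Record one node-local allocation: the node, rack and ANY entries are found
     * directly through the ID, so no scan over unrelated rack/ANY entries is needed.
     */
    void onNodeLocalAllocation(long id, String node, String rack) {
        Map<String, Integer> group = byId.get(id);
        if (group == null) {
            return; // unknown ID: nothing to decrement
        }
        for (String name : new String[] {node, rack, ANY}) {
            // Decrement the count; returning null removes an entry that reaches zero.
            group.computeIfPresent(name, (k, v) -> v > 1 ? Integer.valueOf(v - 1) : null);
        }
    }

    int outstanding(long id, String resourceName) {
        Map<String, Integer> group = byId.get(id);
        return group == null ? 0 : group.getOrDefault(resourceName, 0);
    }
}
```

Entries that belong to other IDs are never touched by the decrement, which is the point of treating the ID as the grouping key.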

bq. While making these changes, would it be possible to address YARN-314 too? 
I'm okay if we can get both in one shot, but I'd caution against risking this effort by blowing
up its size.

> Proposal for a simple (delta) allocate protocol
> -----------------------------------------------
>
>                 Key: YARN-4879
>                 URL: https://issues.apache.org/jira/browse/YARN-4879
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: applications, resourcemanager
>            Reporter: Subru Krishnan
>            Assignee: Subru Krishnan
>         Attachments: SimpleAllocateProtocolProposal-v1.pdf
>
>
> For legacy reasons, the current allocate protocol expects expanded requests which represent
the cumulative request for any change in resource constraints. This is not only very difficult
to comprehend but makes it impossible for the scheduler to associate container allocations
to the original requests. This problem is amplified by the fact that the expansion is managed
by the AMRMClient which makes it cumbersome for non-Java clients as they all have to replicate
the non-trivial logic. In this JIRA, we are proposing a delta allocate protocol where the
AM will need to only specify changes in resource constraints.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
