hadoop-yarn-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
Date Wed, 16 Dec 2015 08:25:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059687#comment-15059687 ]

Varun Saxena commented on YARN-4224:

As we have a call today, I think we can discuss this in detail there.

I will consolidate the points for the sake of discussion.
* For the Ember UI, a hierarchical URL format is not desirable. Refer to the [comment above | https://issues.apache.org/jira/browse/YARN-4224?focusedCommentId=15056762&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15056762].
* So the proposal is to treat the parameters required to make a query as a tuple (represented
as a UID in which these parameters are separated by some delimiter). The idea is also to
fetch information hierarchically, that is, flows -> flowruns -> apps. The
query flow would look something like this.
*Get flows*
The URL in this scheme will be the same as before, i.e. {{/ws/v2/timeline/flows}}. While returning
these flows we will also send a list of flowruns. Please note that this would be based on activity
on each date.
The proposal is that for each flow we send a UID to aid further queries. This can
be filled in the INFO field.
The UID will look like {{cluster_id|user_id|flow_name}} if pipe (|) is the delimiter.
We also return the flowruns for each flow in the same query, so we can attach a UID to each flowrun
as well, which would then be {{cluster_id|user_id|flow_name|flow_run_id}}.
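
A minimal sketch of how such UIDs could be assembled, assuming pipe as the delimiter. The helper name {{build_uid}} and the sample values are hypothetical and not taken from any patch. How a delimiter inside a field value should be escaped is not decided here, so the sketch simply rejects it.

```python
# Hypothetical helper: name, delimiter constant and sample values are assumptions.
UID_DELIMITER = "|"


def build_uid(*parts):
    """Join context fields into one UID, e.g. cluster_id|user_id|flow_name."""
    for part in parts:
        if UID_DELIMITER in part:
            # Escaping is not specified in the proposal, so refuse ambiguous input.
            raise ValueError(f"field {part!r} contains the delimiter")
    return UID_DELIMITER.join(parts)


# UID attached to a flow, and to one of its flowruns, in the "get flows" response.
flow_uid = build_uid("cluster1", "alice", "wordcount")
flowrun_uid = build_uid("cluster1", "alice", "wordcount", "1450253146000")
```

The flowrun UID is the flow UID with the run id appended, which is what lets the UI move from flows to flowruns without keeping the individual fields around.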

*Get flowruns*
To get the flowruns for a specific flow, the endpoint can be {{/ws/v2/timeline/flowruns/\[flow_UID\]}},
where flow_UID is what we returned in the query above, i.e. {{cluster_id|user_id|flow_name}}.
As above, for each flowrun we will fill in a UID of the form {{cluster_id|user_id|flow_name|flow_run_id}}.
IIUC, for a query returning multiple records the UID may not be necessary in Ember, but let's keep it for
consistency. Wangda can confirm though.
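
As a rough client-side illustration, the flow UID could be placed into the path like this. Percent-encoding the UID is an assumption of this sketch (the pipe character is not valid unencoded in a URI path); the function name and sample UID are hypothetical.

```python
from urllib.parse import quote

BASE_PATH = "/ws/v2/timeline"


def flowruns_url(flow_uid):
    """Build the 'get flowruns' path for a flow UID returned by 'get flows'."""
    # safe="" percent-encodes every reserved character, including "|" and "/".
    return f"{BASE_PATH}/flowruns/{quote(flow_uid, safe='')}"


url = flowruns_url("cluster1|alice|wordcount")
# -> /ws/v2/timeline/flowruns/cluster1%7Calice%7Cwordcount
```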

*Get single flowrun*
The endpoint here would be {{/ws/v2/timeline/flowrun/\[flowrun_UID\]}}, where flowrun_UID is
what we returned in the query above, i.e. {{cluster_id|user_id|flow_name|flow_run_id}}.
As above, the returned flowrun will carry a UID of the form {{cluster_id|user_id|flow_name|flow_run_id}}.
For the Ember UI though, the call to get all flowruns may not be necessary, as getFlows may suffice
to get the flowruns for a flow.

*Get apps*
We can get the list of apps under either a flow or a flowrun. Assuming the UI will use the hierarchical
query, let's say we query apps under a flowrun.
So the endpoint can be {{/ws/v2/timeline/flowrunapps/\[flowrun_UID\]}}, where flowrun_UID is {{cluster_id|user_id|flow_name|flow_run_id}}.
Here we need to decide whether to fill in the UID or not. We can fill the UID for each app
as {{cluster_id|user_id|flow_name|flow_run_id|app_id}}, but cluster id and app id should be
enough to query an app, and cluster id can be an optional query param. On the other hand, if we pass
flow information for this query, a peek into the flow context table won't be required.
So we need to discuss this more.

*Get app*
To get an app we can either use the app_UID (containing the flow context) or just use the app id with the cluster id
as an optional query param.
The endpoint can be either {{/ws/v2/timeline/app/\[app_UID\]}} or of the form {{/ws/v2/timeline/app/appid\{?clusterid=zzz\}}}.
This depends on whether we keep only a flat URL structure or support both.
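
The two candidate forms can be compared side by side with a small sketch. The function names and sample ids are hypothetical, and percent-encoding of the UID is an assumption; only the path shapes and the {{clusterid}} query param come from the proposal above.

```python
from urllib.parse import quote, urlencode

BASE_PATH = "/ws/v2/timeline"


def app_url_from_uid(app_uid):
    """Flat form: a single app_UID that already carries the flow context."""
    return f"{BASE_PATH}/app/{quote(app_uid, safe='')}"


def app_url_from_id(app_id, cluster_id=None):
    """Form with a plain app id; the cluster id is an optional query param."""
    url = f"{BASE_PATH}/app/{quote(app_id, safe='')}"
    if cluster_id is not None:
        url += "?" + urlencode({"clusterid": cluster_id})
    return url


by_uid = app_url_from_uid("cluster1|alice|wordcount|1|application_1_0001")
by_id = app_url_from_id("application_1_0001", cluster_id="cluster1")
# by_id -> /ws/v2/timeline/app/application_1_0001?clusterid=cluster1
```

With the first form the reader already has the flow context; with the second it would have to look it up, as noted under *Get apps*.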

*Get entities*
Similar to get app in terms of UID requirements.
Endpoint can be either {{/ws/v2/timeline/entities/\[app_UID\]/entity_type}} or of the form
I have kept entity_type in the path for this query, as it can't be included in the UID when we query
an app. And entity_type is a mandatory param, so it should ideally be in the path.
As part of this query response we can construct an entity_UID of the form {{cluster_id|user_id|flow_name|flow_run_id|app_id|entity_type|entity_id}},
or {{cluster_id|app_id|entity_type|entity_id}} if we exclude the flow context info.
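
On the reader side, both entity_UID shapes could be told apart by their field count. This is only a sketch under the assumption that pipe is the delimiter and that field values never contain it; {{EntityContext}}, {{parse_entity_uid}} and the sample values are hypothetical names, not existing reader code.

```python
from typing import NamedTuple, Optional


class EntityContext(NamedTuple):
    cluster_id: str
    user_id: Optional[str]
    flow_name: Optional[str]
    flow_run_id: Optional[str]
    app_id: str
    entity_type: str
    entity_id: str


def parse_entity_uid(uid, delimiter="|"):
    """Split an entity_UID into its fields; flow context is None when omitted."""
    parts = uid.split(delimiter)
    if len(parts) == 7:
        # Full form: cluster|user|flow|flowrun|app|entity_type|entity_id
        return EntityContext(*parts)
    if len(parts) == 4:
        # Short form without flow context: cluster|app|entity_type|entity_id
        cluster_id, app_id, entity_type, entity_id = parts
        return EntityContext(cluster_id, None, None, None,
                             app_id, entity_type, entity_id)
    raise ValueError(f"unexpected number of fields in UID: {len(parts)}")


full = parse_entity_uid("cluster1|alice|wordcount|1|app_1|SOME_TYPE|entity_1")
short = parse_entity_uid("cluster1|app_1|SOME_TYPE|entity_1")
```

In the short form the reader would still need the flow context table to resolve user, flow and flowrun.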

*Get entity*
Using the UID returned above we can make the query to get a single entity.
But kindly note that we can also query a single app attempt or a single container from the CLI.
In that case, this scheme won't work.

* As Sangjin said, we can support both the flat URL for the UI and the normal hierarchical REST URL
for other clients. This is what I lean towards as well. If that's done, there were proposals
made in [this comment|https://issues.apache.org/jira/browse/YARN-4224?focusedCommentId=15052865&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15052865].
Although what I proposed leads to a shorter URL, as Li pointed out, the format he proposed
already exists in AHS. So I will go with it as well, to keep it consistent across the board.

> Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
> ------------------------------------------------------------------------------------
>                 Key: YARN-4224
>                 URL: https://issues.apache.org/jira/browse/YARN-4224
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-4224-YARN-2928.01.patch
