hadoop-yarn-issues mailing list archives

From "Vrushali C (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-6861) Reader API for sub application entities
Date Thu, 10 Aug 2017 19:23:01 GMT

    [ https://issues.apache.org/jira/browse/YARN-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122159#comment-16122159
] 

Vrushali C edited comment on YARN-6861 at 8/10/17 7:22 PM:
-----------------------------------------------------------

[~rohithsharma], [~varun_saxena] and I discussed this in our weekly call today. The discussion
was about the naming of these APIs. The consensus was that, as of now, we will proceed with this
patch.

We discussed whether we should call it something other than /users/<user>/entities/<entityid>/
to indicate that these entities are queried without knowledge of the YARN
application id. 

At present, these APIs will return sub-application entities. For example, consider a query that a
user "userA" runs on a Tez setup. This user is different from the user, say "userYARN",
who is running the Tez AM. 

Note 1: 
Only entities from such queries will go to two places in the backend: 
- in the entity table, within the context of an application: {code} userYARN / cluster / flow / flowrun id / appid / entity {code}
- in the sub application table, outside the context of an application: {code} userA / cluster / entity {code}
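
The two layouts in Note 1 can be sketched roughly as follows. This only illustrates the logical key order given above; the class name, the function names, the field names and the " / " separator are assumptions made for the example, not the actual row key encoding used by the storage layer.

```python
from typing import NamedTuple


class EntityContext(NamedTuple):
    """Hypothetical container for the fields named in Note 1."""
    app_user: str      # user running the application, e.g. the Tez AM
    sub_app_user: str  # user who ran the query
    cluster: str
    flow: str
    flow_run_id: str
    app_id: str
    entity: str


def entity_table_key(ctx: EntityContext) -> str:
    # Within the context of an application:
    # user / cluster / flow / flowrun id / appid / entity
    return " / ".join([ctx.app_user, ctx.cluster, ctx.flow,
                       ctx.flow_run_id, ctx.app_id, ctx.entity])


def sub_app_table_key(ctx: EntityContext) -> str:
    # Outside the context of an application: user / cluster / entity
    return " / ".join([ctx.sub_app_user, ctx.cluster, ctx.entity])


# Illustrative values only.
ctx = EntityContext("userYARN", "userA", "cluster1", "flow1", "1",
                    "app_1", "query_1")
print(entity_table_key(ctx))
print(sub_app_table_key(ctx))
```

The same entity yields two keys: one rooted at the user running the AM, one rooted at the user who ran the query.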

Note 2: 
In this same example, the Tez AM itself writes some lifecycle events and metrics of its containers.
These will go only to the entity table, for user "userYARN". 

The reader APIs in this patch are going to return data that belongs to the context of entities
stored outside of an application, that is, from the sub application table. 

Reader APIs like GET /ws/v2/timeline/clusters/{cluster name}/apps/{app id}/entities/{entity type}
or GET /ws/v2/timeline/apps/{app id}/entities/{entity type} will return all entities, that
is, entities written in "Note 1" as well as those written in "Note 2". 
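
As a small illustration, the two application-scoped paths quoted above can be assembled like this. The helper name and the sample ids are hypothetical; only the path templates come from the text, and no host or port is assumed.

```python
from typing import Optional
from urllib.parse import quote


def app_entities_path(app_id: str, entity_type: str,
                      cluster: Optional[str] = None) -> str:
    """Build the app-scoped reader path; the cluster segment is optional."""
    prefix = "/ws/v2/timeline"
    if cluster is not None:
        prefix += "/clusters/" + quote(cluster, safe="")
    return (prefix + "/apps/" + quote(app_id, safe="")
            + "/entities/" + quote(entity_type, safe=""))


# Sample ids are made up for the example.
with_cluster = app_entities_path("application_1_0001", "TEZ_DAG_ID",
                                 cluster="cluster1")
without_cluster = app_entities_path("application_1_0001", "TEZ_DAG_ID")
print(with_cluster)
print(without_cluster)
```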

The reader APIs in this patch will return a subset of those entities: the ones written in "Note 1".
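
The relationship between the two kinds of reads can be modelled with a toy in-memory sketch. Everything here (the dictionaries, function names and sample entity names) is hypothetical and stands in for the real tables and reader; it only mirrors the routing described in Note 1 and Note 2.

```python
# Toy stand-ins for the two backend tables.
entity_table = {}   # (app user, cluster, app id) -> list of entities
sub_app_table = {}  # (sub-app user, cluster) -> list of entities


def write_entity(entity, cluster, app_id, app_user, sub_app_user=None):
    # Every entity is stored within the context of its application.
    entity_table.setdefault((app_user, cluster, app_id), []).append(entity)
    # Note 1: entities from a user's query are also stored outside the
    # application context, under that user.
    if sub_app_user is not None:
        sub_app_table.setdefault((sub_app_user, cluster), []).append(entity)


def read_by_app(app_user, cluster, app_id):
    # Models the application-scoped reader APIs.
    return list(entity_table.get((app_user, cluster, app_id), []))


def read_by_user(sub_app_user, cluster):
    # Models the reader APIs in this patch (sub application table).
    return list(sub_app_table.get((sub_app_user, cluster), []))


# Note 1: a query run by userA on an AM owned by userYARN.
write_entity("query_1", "cluster1", "app_1", "userYARN", sub_app_user="userA")
# Note 2: container metrics written by the AM itself.
write_entity("container_metrics_1", "cluster1", "app_1", "userYARN")

by_app = read_by_app("userYARN", "cluster1", "app_1")
by_user = read_by_user("userA", "cluster1")
print(by_app)
print(by_user)
```

In this model the user-scoped read returns only the query entity, while the application-scoped read returns both.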


The point we discussed was that when we move on to having user level (and queue level) aggregations,
we would need reader APIs to return that data. For example, an API that returns, say, megabytemillis
(or all MR counters) for a user within a time range, such as the last week. These APIs help understand
the usage of a user or queue on the cluster. This data is aggregated data, and those APIs could
likely have a similar API format, perhaps /users/<userid>/entities. In this case, we could
call the API /usersummary/<userid>/entities. 

As of now, we will proceed with this patch.











> Reader API for sub application entities
> ---------------------------------------
>
>                 Key: YARN-6861
>                 URL: https://issues.apache.org/jira/browse/YARN-6861
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: YARN-6861-YARN-5355.001.patch, YARN-6861-YARN-5355.002.patch
>
>
> YARN-6733 and YARN-6734 write data into the sub application table. There should be a way to read those entities.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

