hadoop-yarn-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints
Date Wed, 07 Sep 2016 14:36:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470803#comment-15470803 ]

Varun Saxena commented on YARN-5585:

The best solution would be for the Timeline Service to store keys in sorted order.
But a generic entity in ATSv2 can be anything, so it would be impossible for us to specify
well-defined generic behaviour for every entity type under the sun. Only the applications that
use ATSv2 know what an entity ID means for them: for Tez it may be a DAG or Vertex ID,
for Spark it may be a task ID, and so on.

But as your use case suggests, it would not be good for ATSv2 to simply brush off such
requests, since fetching even generic entities in sorted order can be useful to its users.

So how about we provide a PUBLIC interface that ATSv2 users such as Tez can implement to decide
how IDs of a particular entity type are encoded and decoded, so that they are stored in sorted
order in ATSv2? Say, something like an EntityIDConverter interface with an encode function
(takes a String and outputs an encoded byte array) and a decode function (takes a byte array
and converts it back into its String equivalent).
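
A minimal sketch of what such an interface might look like follows. Only the name EntityIDConverter and the encode/decode roles come from the proposal above; the exact method signatures and the UTF-8 fallback class are assumptions made for illustration and are not an existing ATSv2 API.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical interface; signatures are assumed, not part of ATSv2. */
interface EntityIDConverter {
  /** Turns an entity ID into the bytes that would be stored as part of the key. */
  byte[] encode(String entityId);

  /** Reverses encode(), returning the original entity ID string. */
  String decode(byte[] encoded);
}

/** Assumed fallback: store the ID as plain UTF-8 bytes, with no special ordering. */
class Utf8EntityIDConverter implements EntityIDConverter {
  @Override
  public byte[] encode(String entityId) {
    return entityId.getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public String decode(byte[] encoded) {
    return new String(encoded, StandardCharsets.UTF_8);
  }
}
```

The one property any implementation would need is that decode(encode(id)) returns the original id.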
ATSv2 can have a configuration that lists converters against entity types, something
like {{<entity type>:<interface impl>, <entity type>:<interface impl>...}}
All the collectors and readers can load these implementations if they exist on their classpath.
If an implementation does not exist, the default behavior specified above can be adopted. Or we can
just carry out a full scan within the scope of the entity type.
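
The loading step could be sketched roughly as below. ConverterRegistry is a hypothetical helper, the comma-separated value format simply mirrors the example above, and the interface is repeated only so that the snippet compiles on its own. An entry whose class cannot be loaded is left unmapped, which is where the caller would fall back to the default behavior or a full scan.

```java
import java.util.HashMap;
import java.util.Map;

// Repeated here so the sketch is self-contained; signatures are assumed.
interface EntityIDConverter {
  byte[] encode(String entityId);
  String decode(byte[] encoded);
}

/** Hypothetical registry built from "<entity type>:<impl class>,<entity type>:<impl class>". */
class ConverterRegistry {
  private final Map<String, EntityIDConverter> converters = new HashMap<>();

  ConverterRegistry(String configValue) {
    if (configValue == null || configValue.trim().isEmpty()) {
      return; // nothing configured
    }
    for (String pair : configValue.split(",")) {
      String[] parts = pair.trim().split(":", 2);
      if (parts.length != 2) {
        continue; // skip malformed entries
      }
      try {
        Class<?> cls = Class.forName(parts[1].trim());
        EntityIDConverter converter =
            (EntityIDConverter) cls.getDeclaredConstructor().newInstance();
        converters.put(parts[0].trim(), converter);
      } catch (ReflectiveOperationException | ClassCastException e) {
        // Class is not on the classpath or is not a converter: leave the type unmapped.
      }
    }
  }

  /** Returns the configured converter, or null if the caller should fall back. */
  EntityIDConverter get(String entityType) {
    return converters.get(entityType);
  }
}
```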
A DAG ID, for instance, consists of an AppID and 4 bytes of DAG sequence number. So we can write
an encode function that outputs a 16-byte array: 8 bytes of the inverted cluster timestamp
from the AppID, 4 bytes of the inverted sequence number (from the AppID) and 4 bytes of the
inverted DAG sequence number. This ensures that DAGs are stored in descending order. Such an
implementation could, for instance, be provided by Tez.
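
To make the byte layout concrete, here is a rough sketch of such an encode/decode pair. It assumes the DAG ID string has the form dag_<cluster timestamp>_<app sequence>_<dag sequence>, and that "inverted" means subtracting from the maximum value of the type; both are assumptions for illustration. DagIdConverter and compareBytes are hypothetical names, and compareBytes only mimics the unsigned byte-wise ordering that a sorted key store would apply.

```java
import java.nio.ByteBuffer;

/** Hypothetical converter for IDs assumed to look like dag_<clusterTs>_<appSeq>_<dagSeq>. */
final class DagIdConverter {

  static byte[] encode(String dagId) {
    String[] p = dagId.split("_");
    if (p.length != 4 || !p[0].equals("dag")) {
      throw new IllegalArgumentException("Unexpected DAG ID: " + dagId);
    }
    long clusterTs = Long.parseLong(p[1]);
    int appSeq = Integer.parseInt(p[2]);
    int dagSeq = Integer.parseInt(p[3]);
    // Big-endian layout: 8 + 4 + 4 = 16 bytes, each field inverted so newer IDs sort first.
    return ByteBuffer.allocate(16)
        .putLong(Long.MAX_VALUE - clusterTs)
        .putInt(Integer.MAX_VALUE - appSeq)
        .putInt(Integer.MAX_VALUE - dagSeq)
        .array();
  }

  static String decode(byte[] encoded) {
    ByteBuffer buf = ByteBuffer.wrap(encoded);
    long clusterTs = Long.MAX_VALUE - buf.getLong();
    int appSeq = Integer.MAX_VALUE - buf.getInt();
    int dagSeq = Integer.MAX_VALUE - buf.getInt();
    // The 4-digit padding of the app sequence is part of the assumed ID format.
    return String.format("dag_%d_%04d_%d", clusterTs, appSeq, dagSeq);
  }

  /** Unsigned lexicographic comparison, as a byte-ordered key store would sort keys. */
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }
}
```

With this layout, the key for a DAG with a higher sequence number (or from a newer application) compares lower than the key for an older one, so a forward scan returns the most recent DAGs first.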

Would such a solution be acceptable to Tez?

> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>                 Key: YARN-5585
>                 URL: https://issues.apache.org/jira/browse/YARN-5585
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: YARN-5585.v0.patch
> TimelineReader REST APIs provide a lot of filters to retrieve applications. Along
> with those, it would be good to add a new filter, i.e. fromId, so that entities can be
> retrieved after the fromId.
> Example: suppose applications app-1, app-2 ... app-10 are stored in the database.
> *getApps?limit=5* gives app-1 to app-5. But retrieving the next 5 apps is difficult.
> So the proposal is to have fromId in the filter, like *getApps?limit=5&&fromId=app-5*,
> which gives the list of apps from app-6 to app-10.
> This is very useful for pagination in the web UI.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
