hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints
Date Thu, 22 Sep 2016 05:19:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15512200#comment-15512200 ]

Sangjin Lee commented on YARN-5585:

I am also catching up on this discussion (sorry it got delayed).

Generally I am in agreement with Varun and Vrushali on possible approaches. I'd like to add
a few more thoughts to refine the idea.

(1) supporting chronological order sorting
I think that even for framework-specific entities (e.g. tez vertices, MR task entities, etc.),
the "sorting" order cannot be completely arbitrary. Because we have a strong design decision
on reflecting recency in the row keys, the natural sorting order should be the *chronological
order*, or strange things would result.

For YARN entities, the id order would satisfy this for the most part (and ditto for MR entities).
If tez can craft the id's such that the lexicographical order is also the chronological order,
that would be by far the simplest solution to the problem. I'm not sure how feasible it is
for tez to add padding etc. to preserve the chronological order in the entity id's. [~rohithsharma],
can we change the id's to order them properly?
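
To illustrate the padding idea, here is a minimal sketch. It is not tez code: the class name, the "vertex_" prefix, and the 6-digit width are assumptions made up for this example. It only shows that a fixed-width, zero-padded numeric suffix makes plain String ordering agree with numeric (and hence creation) order, while unpadded suffixes do not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PaddedIdSketch {
  // Hypothetical id format: prefix plus a zero-padded, fixed-width sequence number.
  static String paddedId(String prefix, long seq) {
    return String.format("%s%06d", prefix, seq);
  }

  // Returns a copy sorted in plain lexicographical (String.compareTo) order.
  static List<String> sorted(List<String> ids) {
    List<String> copy = new ArrayList<>(ids);
    Collections.sort(copy);
    return copy;
  }

  public static void main(String[] args) {
    // Unpadded ids: "vertex_10" sorts before "vertex_2".
    System.out.println(sorted(Arrays.asList("vertex_2", "vertex_10", "vertex_1")));
    // Padded ids: lexicographical order matches numeric order.
    System.out.println(sorted(Arrays.asList(
        paddedId("vertex_", 2), paddedId("vertex_", 10), paddedId("vertex_", 1))));
  }
}
```

The fixed width puts an upper bound on the sequence number (999999 in this sketch), so the framework would have to pick a width that is large enough up front.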

If the framework cannot make the lexicographical order of the id's the same as the chronological
order, then we might have to introduce the notion of bytes provided by the framework (and an
auxiliary table) to support this, as suggested by Vrushali and Varun. But that would come at
some cost. All things being equal, I would love not to populate another table on the write path.

Also note that we still need to be able to support single-entity queries in this case (i.e.
queries by entity id). How would we be able to support queries by id in this case?

(2) setting the created time field
In timeline service v.2, the strong assumption/requirement is that the created time is set
by the client. It sounds like the current tez code does not set the created time. I think
it should be set. That's the contract we're using. We're not really expecting an empty created
time when we write them.

(3) TimelineEntity.compareTo()
It is a good catch by Rohith. It escaped the review, but it does appear that the id sorting
used when the created time is empty is the opposite of what it should be. The id strings should
be sorted in descending order, but the current code sorts them in ascending order. We can
either fix it here or open a separate subtask for it. Either way, we should fix it.
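
To make the intended ordering concrete, here is a minimal sketch of such a comparison. It is not the actual TimelineEntity code: the EntitySketch class and its fields are hypothetical, and placing entities that have a created time ahead of those that do not is an assumption of this example. It orders by created time with the most recent first, and falls back to descending id order when the created time is missing or equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class EntitySketch implements Comparable<EntitySketch> {
  final String id;
  final Long createdTime; // null models an unset created time

  EntitySketch(String id, Long createdTime) {
    this.id = id;
    this.createdTime = createdTime;
  }

  @Override
  public int compareTo(EntitySketch other) {
    if (createdTime != null && other.createdTime != null) {
      // Most recent first: compare in reverse.
      int byTime = other.createdTime.compareTo(createdTime);
      if (byTime != 0) {
        return byTime;
      }
    } else if (createdTime != null) {
      return -1; // assumption: entities with a created time come first
    } else if (other.createdTime != null) {
      return 1;
    }
    // Created time missing on both sides (or equal): descending id order.
    return other.id.compareTo(id);
  }

  // Helper for the demo: sorts a copy and returns the ids in sorted order.
  static List<String> sortedIds(List<EntitySketch> entities) {
    List<EntitySketch> copy = new ArrayList<>(entities);
    Collections.sort(copy);
    List<String> ids = new ArrayList<>();
    for (EntitySketch e : copy) {
      ids.add(e.id);
    }
    return ids;
  }

  public static void main(String[] args) {
    System.out.println(sortedIds(Arrays.asList(
        new EntitySketch("a", null),
        new EntitySketch("c", null),
        new EntitySketch("b", null))));
  }
}
```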

> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>                 Key: YARN-5585
>                 URL: https://issues.apache.org/jira/browse/YARN-5585
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Critical
>         Attachments: YARN-5585.v0.patch
> The TimelineReader REST APIs provide a lot of filters to retrieve the applications. Along
with those, it would be good to add a new filter, i.e. fromId, so that entities can be retrieved
after the fromId.
> Current Behavior : Default limit is set to 100. If there are 1000 entities then the REST
call gives the first/last 100 entities. How to retrieve the next set of 100 entities, i.e. 101 to 200
OR 900 to 801?
> Example : If applications are stored in the database as app-1 app-2 ... app-10.
> *getApps?limit=5* gives app-1 to app-5. But there is no way to retrieve the next 5 apps.
> So the proposal is to have fromId in the filter like *getApps?limit=5&&fromId=app-5*
which gives the list of apps from app-6 to app-10.
> Since ATS is targeting storage of a large number of entities, it is a very common use case to
get the next set of entities using fromId rather than querying all the entities. This is very useful
for pagination in the web UI.
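
The fromId behavior described in the quoted issue can be sketched as follows. This is an in-memory illustration only, not the TimelineReader implementation: the class and method names are hypothetical, and returning an empty page for an unknown fromId is an assumption of this example. Following the example in the description, the entity named by fromId itself is excluded from the result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class FromIdPagingSketch {
  // Returns up to `limit` ids that follow `fromId` in storage order.
  // A null fromId means "start from the beginning".
  static List<String> page(List<String> idsInStorageOrder, String fromId, int limit) {
    int start = 0;
    if (fromId != null) {
      int idx = idsInStorageOrder.indexOf(fromId);
      if (idx < 0) {
        return Collections.emptyList(); // assumption: unknown fromId yields an empty page
      }
      start = idx + 1; // exclusive: resume after fromId
    }
    int end = Math.min(idsInStorageOrder.size(), start + limit);
    return new ArrayList<>(idsInStorageOrder.subList(start, end));
  }

  // Builds app-1 .. app-n in storage order for the demo.
  static List<String> apps(int n) {
    List<String> ids = new ArrayList<>();
    for (int i = 1; i <= n; i++) {
      ids.add("app-" + i);
    }
    return ids;
  }

  public static void main(String[] args) {
    List<String> apps = apps(10);
    System.out.println(page(apps, null, 5));    // first page: app-1 .. app-5
    System.out.println(page(apps, "app-5", 5)); // next page: app-6 .. app-10
  }
}
```

A client would pass the last id of the page it just received as the fromId of its next request.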

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
