hadoop-yarn-issues mailing list archives

From "Rohith Sharma K S (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters
Date Wed, 21 Dec 2016 04:49:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15766114#comment-15766114
] 

Rohith Sharma K S commented on YARN-5585:
-----------------------------------------

[~varun_saxena]
bq. Is there any need to populate id prefix in TimelineReaderManager#fillUID
Yes, it is required. When an entity is retrieved, its UID is constructed from the entity details.
For encoding, we pass the context. If idPrefix and entityId were not provided in the query, we need
to set them in the context before encoding the UID. Otherwise the UID will be wrong.
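
To illustrate the point about filling the context before encoding, here is a minimal sketch. The class UidSketch, its Context type, the field names, and the "!" delimiter are all hypothetical stand-ins chosen for illustration; they are not the actual TimelineReaderManager or UID converter code.

```java
public class UidSketch {
  /** Hypothetical stand-in for the reader context; not the real ATSv2 class. */
  public static final class Context {
    final String clusterId;
    final String appId;
    final String entityType;
    Long entityIdPrefix; // may be absent from the incoming query
    String entityId;     // may be absent from the incoming query

    public Context(String clusterId, String appId, String entityType) {
      this.clusterId = clusterId;
      this.appId = appId;
      this.entityType = entityType;
    }
  }

  /** Encodes a UID; refuses to encode when the entity-specific parts are missing. */
  public static String encode(Context c) {
    if (c.entityIdPrefix == null || c.entityId == null) {
      throw new IllegalStateException("idPrefix and entityId must be set before encoding");
    }
    // The "!" delimiter is an assumption made for this sketch.
    return String.join("!", c.clusterId, c.appId, c.entityType,
        String.valueOf(c.entityIdPrefix), c.entityId);
  }

  /** Copies the retrieved entity's details into the context, then encodes the UID. */
  public static String fillUid(Context queryContext, long idPrefix, String entityId) {
    queryContext.entityIdPrefix = idPrefix;
    queryContext.entityId = entityId;
    return encode(queryContext);
  }

  public static void main(String[] args) {
    Context ctx = new Context("cluster1", "app_1", "YARN_CONTAINER");
    System.out.println(fillUid(ctx, 42L, "container_1"));
  }
}
```

Encoding the query context without first copying the retrieved entity's idPrefix and id into it would either fail or produce a UID that does not identify the entity.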

bq. Add a javadoc for the new query params.
bq. Javadoc in TimelineReader should be changed. It currently says entities would be sorted
by created time which is no longer true.
I have not added/modified javadoc anywhere in the patch. I will add/modify it accordingly.



bq. As all our query params are in lower case we can name it as "fromid" ?
Cool, makes sense.

bq. TimelineFilterUtils#createSingleColValueFilters doesnt seem like an apt name as it returns
a single filter. Also why wrap a single filter in a filter list ? We can probably call createHBaseSingleColValue
method directly from GenericEntityReader#getResult
Firstly, I will change it to the singular form, i.e. createHBaseSingleColValueFilter. The return type
can be changed to Filter. We can have one wrapper for the method createHBaseSingleColValueFilter
that takes the column as input.

bq. In getResult how about setting a PageFilter of 2 in addition to SingleColumnValueFilter
? This will reduce the rows scanned if there are many duplicate rows. Not a typical case though.
Makes sense, that's right. It improves scanning performance.
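
The idea behind capping the scan at two rows can be sketched in plain Java without the HBase API. This is a simulation only: DuplicateScanSketch and countMatchesUpTo are hypothetical names, and a list of entity ids stands in for the scanned rows. Two matching rows are enough to know a duplicate exists, so there is no need to read further.

```java
import java.util.List;

public class DuplicateScanSketch {
  /**
   * Counts rows whose entity id equals entityId, but stops reading as soon as
   * maxRows matches have been seen. The result is min(actual matches, maxRows).
   */
  public static int countMatchesUpTo(List<String> rowEntityIds, String entityId, int maxRows) {
    int matches = 0;
    for (String id : rowEntityIds) {
      if (matches >= maxRows) {
        break; // enough rows seen; skip the rest of the scan
      }
      if (id.equals(entityId)) {
        matches++;
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    List<String> rows = List.of("e1", "e2", "e2", "e2", "e3");
    int found = countMatchesUpTo(rows, "e2", 2);
    // 0 = not found, 1 = unique entity, 2 = duplicate detected
    System.out.println(found);
  }
}
```

With a cap of 2, the result distinguishes "not found", "exactly one", and "duplicate" without visiting every duplicate row.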


bq. GenericEntityReader line 458 there is a typo. Should not be idprefixe
Will fix it in the next patch.

bq. Should we throw an exception for get entities call too if duplicate entity is found ?
I am -0. This would also require changing the TimelineEntity equals method to compare idprefix
as well. Let's wait for other folks' opinions.

[~gtCarrera9]
bq. Though there is no specific rule, let's not put specific author names in the test data?
My bad, will keep the old one.

bq. Shall we avoid using those constants? We can set an enum to represent each part of the
tuple list.
Makes sense to me. Shall we handle this in a separate JIRA?

bq. I'm confused by the changes in EntityRowKeyPrefix(String clusterId, String userId, String
flowName, Long flowRunId, String appId, String entityType, Long entityIdPrefix, String entityId).
Why are we changing this method, but do not overload a new one? Some changes to existing callsites
seems irrelevant to the changes here.
IIRC, one constructor was removed to address [~sjlee0]'s [comment|https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15634721&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15634721].
Any EntityRowKeyPrefix can be constructed using the new constructor. The other constructor was not
really required, other than by some of the test cases using it.

> [Atsv2] Reader side changes for entity prefix and support for pagination via additional
filters
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-5585
>                 URL: https://issues.apache.org/jira/browse/YARN-5585
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Critical
>              Labels: yarn-5355-merge-blocker
>         Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, YARN-5585-YARN-5355.0002.patch,
YARN-5585-YARN-5355.0003.patch, YARN-5585-workaround.patch, YARN-5585.v0.patch
>
>
> TimelineReader REST APIs provide a lot of filters to retrieve the applications. Along
with those, it would be good to add a new filter, i.e. fromId, so that entities can be retrieved
after the fromId. 
> Current behavior: The default limit is set to 100. If there are 1000 entities, then the REST
call gives the first/last 100 entities. How do we retrieve the next set of 100 entities, i.e. 101 to 200
OR 900 to 801?
> Example: If applications are stored in the database as app-1 app-2 ... app-10,
> *getApps?limit=5* gives app-1 to app-5. But there is no way to retrieve the next 5 apps.
> So the proposal is to have fromId in the filter, like *getApps?limit=5&fromId=app-5*,
which gives the list of apps from app-6 to app-10. 
> Since ATS is targeting storage of a large number of entities, it is a very common use case to
get the next set of entities using fromId rather than querying all the entities. This is very useful
for pagination in the web UI.
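
The fromId proposal in the issue description can be sketched as follows. This is a minimal in-memory illustration, not the reader implementation: FromIdPagination and page are hypothetical names, the list stands in for entities in storage order, and fromId is treated as exclusive because the description says app-6 to app-10 are returned for fromId=app-5.

```java
import java.util.ArrayList;
import java.util.List;

public class FromIdPagination {
  /**
   * Returns up to limit ids that follow fromId in storage order.
   * A null fromId starts from the beginning of the list.
   */
  public static List<String> page(List<String> stored, String fromId, int limit) {
    if (limit < 0) {
      throw new IllegalArgumentException("limit must be non-negative");
    }
    int start = 0;
    if (fromId != null) {
      int idx = stored.indexOf(fromId);
      if (idx < 0) {
        throw new IllegalArgumentException("unknown fromId: " + fromId);
      }
      start = idx + 1; // exclusive: resume after the last id already seen
    }
    int end = start + Math.min(limit, stored.size() - start);
    return new ArrayList<>(stored.subList(start, end));
  }

  public static void main(String[] args) {
    List<String> apps = new ArrayList<>();
    for (int i = 1; i <= 10; i++) {
      apps.add("app-" + i);
    }
    System.out.println(page(apps, null, 5));    // [app-1, app-2, app-3, app-4, app-5]
    System.out.println(page(apps, "app-5", 5)); // [app-6, app-7, app-8, app-9, app-10]
  }
}
```

A client pages through results by passing the last id of the previous page as fromId, so each request reads only one page rather than all entities.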



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


