hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vrushali C (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
Date Mon, 29 Feb 2016 21:37:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172651#comment-15172651
] 

Vrushali C commented on YARN-4700:
----------------------------------

Hi [~Naganarasimha]

Thanks for the patch. I believe the constructor for FlowActivityRowKey should change to correctly
calculate top of the day timestamp given the input timestamp. That is the reason the unit
test is failing I think, since the FlowActivityRowKey is constructed with FlowActivityRowKey.getRowKey(clusterStop,
appCreatedTime, user, flow). 

Also,I think we can remove the  function FlowActivityRowKey #getRowKey(String clusterId, String
userId, String  * flowName) and only keep the FlowActivityRowKey# getRowKey(String clusterId,
long dayTs, String userId, String flowName) . That way it's easier to clean up the unit tests
as well.

And I think you can change the unit test to use different timestamps (but keep the same semantics,
i.e. min start time should actually be the lowest one etc), that way it will be easier to
refactor the unit test. Let me know if this helps. Right now the unit test checks in the flow
activity table that one entry has been made for all of these 4 application entities so you
can use the timestamps that belong to exactly the same day. Or if you use timestamps belonging
to different days, change the test to look for those many entries.

Another thing is that, it looks like the event timestamp that is being used is the timelineEvents.next().getTimestamp().
It might be more explicit to fetch the exact created (or finished) event from the TimelineEntity
and use the timestamp that belong to either ApplicationMetricsConstants.CREATED_EVENT_TYPE
or ApplicationMetricsConstants.FINISHED_EVENT_TYPE. That way, we are using the accurate event
time to make an entry into the flow activity table. You can use TimelineStorageUtils # getApplicationFinishedTime()
function for getting the timestamp for the FINISHED event. You would have to write a new function
to do a similar thing for fetching CREATED event timestamp (or refactor further and use the
same function to get the right event's timestamp).

Hope this helps.. Let me know..

thanks
Vrushali

> ATS storage has one extra record each time the RM got restarted
> ---------------------------------------------------------------
>
>                 Key: YARN-4700
>                 URL: https://issues.apache.org/jira/browse/YARN-4700
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Li Lu
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-4700-YARN-2928.v1.001.patch, YARN-4700-YARN-2928.wip.patch
>
>
> When testing the new web UI for ATS v2, I noticed that we're creating one extra record
for each finished application (but still hold in the RM state store) each time the RM got
restarted. It's quite possible that we add the cluster start timestamp into the default cluster
id, thus each time we're creating a new record for one application (cluster id is a part of
the row key). We need to fix this behavior, probably by having a better default cluster id.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message