hadoop-yarn-issues mailing list archives

From "Joep Rottinghuis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4063) Populate the flow activity table
Date Fri, 21 Aug 2015 18:51:47 GMT

    [ https://issues.apache.org/jira/browse/YARN-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707256#comment-14707256 ]

Joep Rottinghuis commented on YARN-4063:

Min/max should work well.
I wonder how we would guarantee that a flush or compaction actually happens around the end of the day.

It may be better to let the AM take a daily snapshot so it can decide how to divide counters across
daily boundaries.


> Populate the flow activity table
> --------------------------------
>                 Key: YARN-4063
>                 URL: https://issues.apache.org/jira/browse/YARN-4063
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Vrushali C
> Need to populate the flow_activity table
> -Stores per day flow run pointers and info
> -Written to by RM’s collector for application lifecycle
> primary key: cluster ! day timestamp ! user ! flow id 
> -For the day timestamp we can take the millis since epoch for the end of the day (24:00h).
> columns include runids, start time, end time, snapshot time
> -This table will also be used to efficiently retrieve the flows that had activity
> on a given day. That is needed for daily aggregations, but also for several UIs, including
> a flow-based UI.
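
The row key layout described above (cluster ! day timestamp ! user ! flow id) can be illustrated with a minimal sketch. It assumes the day boundary is taken in UTC and that the key is a plain string joined with "!"; the class and method names (FlowActivityRowKey, endOfDayMillis, rowKey) are hypothetical, and an actual implementation would more likely encode the components as bytes and escape the separator.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not taken from the YARN code base.
public class FlowActivityRowKey {
    private static final long DAY_MILLIS = TimeUnit.DAYS.toMillis(1);
    // Assumption: "!" as a literal separator, with no escaping of components.
    private static final String SEPARATOR = "!";

    // Millis since epoch for the end (24:00h) of the UTC day containing tsMillis,
    // which is the same instant as 00:00h of the following day.
    static long endOfDayMillis(long tsMillis) {
        return (Math.floorDiv(tsMillis, DAY_MILLIS) + 1) * DAY_MILLIS;
    }

    // Builds: cluster ! day timestamp ! user ! flow id
    static String rowKey(String cluster, long tsMillis, String user, String flowId) {
        return String.join(SEPARATOR,
                cluster,
                Long.toString(endOfDayMillis(tsMillis)),
                user,
                flowId);
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        System.out.println(rowKey("cluster1", 1440183107000L, "alice", "daily_etl"));
    }
}
```

With this layout, every event a flow produces during one day maps to the same row, so a prefix scan on cluster plus day timestamp would return the flows active on that day.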
