hadoop-yarn-issues mailing list archives

From "Vrushali C (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6850) Ensure that supplemented timestamp is stored only for flow run metrics
Date Fri, 21 Jul 2017 02:42:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095696#comment-16095696 ]

Vrushali C commented on YARN-6850:

Thanks [~varun_saxena] for the patch. Yes, I think we want to add something to the documentation,
perhaps as part of this jira, mentioning that when we move from alpha1 to beta, the existing
time-series metrics may not be retrievable.

I have a question that is unrelated to this patch; I am only noticing it now.
Why do we have this check?

bq. if (tsBegin != 0 || tsEnd != Long.MAX_VALUE) {

If someone wants all versions of this metric, how would they get them without knowing the
boundaries? It's not much of a problem for users querying manually, but scripts that issue
such queries sometimes pass 0 as the min and Long.MAX_VALUE as the max in order to
fetch everything. Do we want to disallow this? (just wondering what the thought process
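
To make the concern concrete, here is a minimal sketch of how the quoted condition classifies a few inputs. Only the boolean expression comes from the code quoted above; the class name, the method name hasExplicitRange, and the sample timestamps are hypothetical, and what the real code does inside the branch is not shown in this thread.

```java
public class TimeRangeCheck {
    // Same boolean expression as the quoted check: true only when the
    // caller moved at least one bound away from its "fetch all" default.
    static boolean hasExplicitRange(long tsBegin, long tsEnd) {
        return tsBegin != 0 || tsEnd != Long.MAX_VALUE;
    }

    public static void main(String[] args) {
        // A script asking for "everything" with 0 and Long.MAX_VALUE
        // looks exactly like a caller that supplied no range at all.
        System.out.println(hasExplicitRange(0L, Long.MAX_VALUE));             // false
        // Narrowing either bound makes the condition true.
        System.out.println(hasExplicitRange(1500000000000L, Long.MAX_VALUE)); // true
        System.out.println(hasExplicitRange(0L, 1500604920000L));             // true
    }
}
```

So a query that passes 0 and Long.MAX_VALUE explicitly cannot be told apart from one that passes no bounds, which is the case the question above is asking about.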

> Ensure that supplemented timestamp is stored only for flow run metrics
> ----------------------------------------------------------------------
>                 Key: YARN-6850
>                 URL: https://issues.apache.org/jira/browse/YARN-6850
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Varun Saxena
>              Labels: yarn-5355-merge-blocker
>         Attachments: YARN-6850-YARN-5355.01.patch
> In timeline service v2, ColumnHelper#getPutTimestamp supplements the timestamp and is
> called by ColumnHelper#store. This is not conditional and called for every put.
> We need to ensure that the cell timestamps for metrics in entity and application (and
> sub application) tables are "correct" timestamps since we will be enabling TTLs for these
> The supplemented timestamp is to be used only in the flow run table by the coprocessor
> which intercepts all reads & writes to cells in this table. It looks at the supplemented
> timestamp to figure out which app id this particular cell belongs to. This is done in order
> to ensure no collision occurs when two apps belonging to same flow run write the same metric
> at the same timestamp.
> Discovered in the discussion in YARN-4455
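
The collision-avoidance idea in the description can be illustrated with a small sketch. This is not the actual ColumnHelper#getPutTimestamp code: the multiplier value, the way the suffix is derived from the app id, and all class and method names below are assumptions made for illustration only.

```java
public class SupplementedTimestampSketch {
    // Hypothetical multiplier: scales the timestamp so the low-order
    // digits are free to carry an app-specific suffix.
    static final long TS_MULTIPLIER = 1_000_000L;

    // Assumes app ids look like "application_<clusterTs>_<sequence>" and
    // uses the trailing sequence number as the suffix.
    static long appIdSuffix(String appId) {
        int idx = appId.lastIndexOf('_');
        long sequence = Long.parseLong(appId.substring(idx + 1));
        return sequence % TS_MULTIPLIER;
    }

    // Combines the metric timestamp with the app suffix, so two apps
    // writing at the same millisecond get distinct cell timestamps.
    static long supplement(long timestampMs, String appId) {
        return timestampMs * TS_MULTIPLIER + appIdSuffix(appId);
    }

    // Recovers the original timestamp from a supplemented one.
    static long truncate(long supplementedTs) {
        return supplementedTs / TS_MULTIPLIER;
    }

    public static void main(String[] args) {
        long ts = 1500604920000L;
        long a = supplement(ts, "application_1500000000000_0001");
        long b = supplement(ts, "application_1500000000000_0002");
        System.out.println(a != b);             // true: no collision
        System.out.println(truncate(a) == ts);  // true: original recoverable
    }
}
```

The sketch also shows why such a value is a poor fit for tables with TTLs: the stored cell timestamp is no longer a wall-clock time, so any expiry computed from it would be wrong unless it is truncated first.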
