hadoop-yarn-issues mailing list archives

From "Vrushali C (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-6850) Ensure that supplemented timestamp is stored only for flow run metrics
Date Fri, 21 Jul 2017 02:43:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095696#comment-16095696 ]

Vrushali C edited comment on YARN-6850 at 7/21/17 2:42 AM:
-----------------------------------------------------------

Thanks [~varun_saxena] for the patch. Yes, I think we should add a note to the documentation,
perhaps as part of this jira, mentioning that when we move from alpha1 to beta, the existing
timeseries metrics may not be retrievable.

I have a question that is not specific to this patch, but I am only noticing it now.
Why do we have this check?

{code} if (tsBegin != 0 || tsEnd != Long.MAX_VALUE) { {code} 

If someone wants all versions of this metric, how would they get them without knowing the
boundaries? It's not much of a problem for users querying manually, but scripts that issue
such queries sometimes pass 0 as the min and Long.MAX_VALUE as the max in order to
fetch everything. Do we want to disallow this? (Just wondering what the thought process
was.)
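
To make the concern concrete, here is a minimal sketch of how the quoted condition evaluates for a few bound pairs. The class and method names are hypothetical and only wrap the condition itself; what the patch does inside the guarded branch is not shown or assumed here.

```java
// Illustrative only: wraps the condition quoted above so its behaviour
// can be checked in isolation. Not taken from the patch.
final class TimeRangeCheck {

  // True when at least one bound differs from the "fetch everything"
  // defaults of 0 and Long.MAX_VALUE.
  static boolean hasExplicitTimeRange(long tsBegin, long tsEnd) {
    return tsBegin != 0 || tsEnd != Long.MAX_VALUE;
  }

  public static void main(String[] args) {
    // A script that passes the widest possible bounds to fetch everything.
    System.out.println(hasExplicitTimeRange(0L, Long.MAX_VALUE)); // false

    // Only the lower bound is set.
    System.out.println(hasExplicitTimeRange(1500000000000L, Long.MAX_VALUE)); // true

    // Only the upper bound is set.
    System.out.println(hasExplicitTimeRange(0L, 1500605000000L)); // true
  }
}
```

With bounds of 0 and Long.MAX_VALUE the condition is false, so whatever the guarded branch does is skipped for exactly the pair that scripts tend to pass when they want every version.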




> Ensure that supplemented timestamp is stored only for flow run metrics
> ----------------------------------------------------------------------
>
>                 Key: YARN-6850
>                 URL: https://issues.apache.org/jira/browse/YARN-6850
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Varun Saxena
>              Labels: yarn-5355-merge-blocker
>         Attachments: YARN-6850-YARN-5355.01.patch
>
>
> In timeline service v2, ColumnHelper#getPutTimestamp supplements the timestamp and is
> called by ColumnHelper#store. This is not conditional and is called for every put.
> We need to ensure that the cell timestamps for metrics in the entity and application (and
> sub application) tables are "correct" timestamps, since we will be enabling TTLs for these
> cells.
> The supplemented timestamp is to be used only in the flow run table by the coprocessor
> that intercepts all reads & writes to cells in this table. It looks at the supplemented
> timestamp to figure out which app id this particular cell belongs to. This is done in order
> to ensure no collision occurs when two apps belonging to the same flow run write the same metric
> at the same timestamp.
> Discovered in the discussion in YARN-4455
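
The collision-avoidance idea in the description can be sketched as follows. This is an illustration under stated assumptions, not the ColumnHelper code: the multiplier value, the use of a numeric app sequence number as the suffix, and all names below are hypothetical.

```java
// Illustrative only: one way to fold an app-specific suffix into a cell
// timestamp so that two apps writing at the same millisecond do not collide.
final class SupplementedTimestampSketch {

  // Assumed multiplier: leaves six decimal digits for the app suffix.
  static final long TS_MULTIPLIER = 1_000_000L;

  // Combine a millisecond timestamp with an app sequence number.
  static long supplement(long timestampMillis, long appSequence) {
    return timestampMillis * TS_MULTIPLIER + (appSequence % TS_MULTIPLIER);
  }

  // Recover the original millisecond timestamp.
  static long truncate(long supplemented) {
    return supplemented / TS_MULTIPLIER;
  }

  // Recover the suffix identifying which app wrote the cell.
  static long appSuffix(long supplemented) {
    return supplemented % TS_MULTIPLIER;
  }

  public static void main(String[] args) {
    long ts = 1500605000000L; // same wall-clock millisecond for both apps
    long cellA = supplement(ts, 1L);
    long cellB = supplement(ts, 2L);

    System.out.println(cellA != cellB);        // true: distinct cell versions
    System.out.println(truncate(cellA) == ts); // true: original time recoverable
    System.out.println(appSuffix(cellB));      // 2
  }
}
```

Under this scheme the stored value is no longer a plain epoch-millisecond timestamp, which is consistent with the description's request to keep supplemented timestamps out of tables where TTLs will be enabled.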



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

