hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Joep Rottinghuis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3901) Populate flow run data in the flow_run & flow activity tables
Date Wed, 09 Sep 2015 00:35:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735900#comment-14735900

Joep Rottinghuis commented on YARN-3901:

The one remaining issue we have to tackle is when there are two app attempts. The previous
app attempt ends up buffering some writes, and the new app attempt ends up writing a final_value.
Now if the flush happens before the first attempt its write comes in, we no longer have the
unaggregated value for that app_id in order to discard against (the timestamp should have
taken care of this order).
We can deal with this issue in three ways:
1) Ignore (risky and very hard to debug if it ever happens)
2) Keep the final value around until it has aged a certain time. Upside is that the value
is initially kept (for for example 1-2 days?) and then later discarded. Downside is that we
won't collapse values as quickly on flush as we can. The collapse would probably happen when
a compaction happens, possibly only when a major compaction happens. But previous unaggregated
values may have been written to disk anyway, so not sure how much of an issue this really
3) keep a list of the last x app_ids (aggregation compaction dimension values) on the aggregated
flow-level data. What we would then do in the aggregator is to go through all the values as
we currently do. We'd collapse all the values to keep only the latest per flow. Before we
sum an item for the flow, we'd compare if the app_id was in the list of most recent x (10)
apps that were completed and collapsed. 
Pro is that with a lower app completion rate in a flow, we'd be guarded against stale writes
for longer than a fixed time period. We'd still limit the size of extra storage in tags to
a list of x (10?) items.
Downside is that if apps complete in very rapid succession, we would potentially be protected
from stale writes from an app for a shorter period of time. Given that there is a correlation
between an app completion and its previous run, this may not be a huge factor. It's not like
random previous app attempts are launched. This is really to cover the case when a new app
attempt is launched, but the previous writer had some buffered writes that somehow still got

I'm sort of tempted towards 2, since that is the most similar to the existing TTL functionality,
and probably the easiest to code and understand. Simply compact only after a certain time
period has passed.

> Populate flow run data in the flow_run & flow activity tables
> -------------------------------------------------------------
>                 Key: YARN-3901
>                 URL: https://issues.apache.org/jira/browse/YARN-3901
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Vrushali C
>         Attachments: YARN-3901-YARN-2928.1.patch, YARN-3901-YARN-2928.2.patch, YARN-3901-YARN-2928.3.patch,
YARN-3901-YARN-2928.4.patch, YARN-3901-YARN-2928.5.patch
> As per the schema proposed in YARN-3815 in https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf
> filing jira to track creation and population of data in the flow run table. 
> Some points that are being  considered:
> - Stores per flow run information aggregated across applications, flow version
> RM’s collector writes to on app creation and app completion
> - Per App collector writes to it for metric updates at a slower frequency than the metric
updates to application table
> primary key: cluster ! user ! flow ! flow run id
> - Only the latest version of flow-level aggregated metrics will be kept, even if the
entity and application level keep a timeseries.
> - The running_apps column will be incremented on app creation, and decremented on app
> - For min_start_time the RM writer will simply write a value with the tag for the applicationId.
A coprocessor will return the min value of all written values. - 
> - Upon flush and compactions, the min value between all the cells of this column will
be written to the cell without any tag (empty tag) and all the other cells will be discarded.
> - Ditto for the max_end_time, but then the max will be kept.
> - Tags are represented as #type:value. The type can be not set (0), or can indicate running
(1) or complete (2). In those cases (for metrics) only complete app metrics are collapsed
on compaction.
> - The m! values are aggregated (summed) upon read. Only when applications are completed
(indicated by tag type 2) can the values be collapsed.
> - The application ids that have completed and been aggregated into the flow numbers are
retained in a separate column for historical tracking: we don’t want to re-aggregate for
those upon replay

This message was sent by Atlassian JIRA

View raw message