hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3901) Populate flow run data in the flow_run & flow activity tables
Date Thu, 17 Sep 2015 17:12:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803240#comment-14803240 ]

Sangjin Lee commented on YARN-3901:
-----------------------------------

Jenkins does kick in, but for some reason it cannot post the result to the JIRA. The result
is the following:

-1 overall

| Vote |           Subsystem |  Runtime   | Comment
============================================================================
|  -1  |          pre-patch  |  16m 3s    | Findbugs (version ) appears to be 
|      |                     |            | broken on YARN-2928.
|  +1  |            @author  |  0m 0s     | The patch does not contain any 
|      |                     |            | @author tags.
|  +1  |     tests included  |  0m 0s     | The patch appears to include 4 new 
|      |                     |            | or modified test files.
|  +1  |              javac  |  8m 26s    | There were no new javac warning 
|      |                     |            | messages.
|  +1  |            javadoc  |  10m 47s   | There were no new javadoc warning 
|      |                     |            | messages.
|  +1  |      release audit  |  0m 24s    | The applied patch does not increase 
|      |                     |            | the total number of release audit
|      |                     |            | warnings.
|  +1  |         checkstyle  |  0m 16s    | There were no new checkstyle 
|      |                     |            | issues.
|  -1  |         whitespace  |  0m 50s    | The patch has 9 line(s) that end in 
|      |                     |            | whitespace. Use git apply
|      |                     |            | --whitespace=fix.
|  +1  |            install  |  1m 39s    | mvn install still works. 
|  +1  |    eclipse:eclipse  |  0m 43s    | The patch built with 
|      |                     |            | eclipse:eclipse.
|  +1  |           findbugs  |  0m 52s    | The patch does not introduce any 
|      |                     |            | new Findbugs (version 3.0.0)
|      |                     |            | warnings.
|  +1  |         yarn tests  |  2m 37s    | Tests passed in 
|      |                     |            | hadoop-yarn-server-timelineservice.
|      |                     |  42m 43s   | 


|| Subsystem || Report/Notes ||
============================================================================
| Patch URL | http://issues.apache.org/jira/secure/attachment/12756422/YARN-3901-YARN-2928.10.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / b1960e0 |
| whitespace | /home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/patchprocess/whitespace.txt |
| hadoop-yarn-server-timelineservice test log | /home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9191/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |

I'll remove the whitespace as I commit it. +1?

> Populate flow run data in the flow_run & flow activity tables
> -------------------------------------------------------------
>
>                 Key: YARN-3901
>                 URL: https://issues.apache.org/jira/browse/YARN-3901
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Vrushali C
>         Attachments: YARN-3901-YARN-2928.1.patch, YARN-3901-YARN-2928.10.patch, YARN-3901-YARN-2928.2.patch,
>                      YARN-3901-YARN-2928.3.patch, YARN-3901-YARN-2928.4.patch, YARN-3901-YARN-2928.5.patch,
>                      YARN-3901-YARN-2928.6.patch, YARN-3901-YARN-2928.7.patch, YARN-3901-YARN-2928.8.patch,
>                      YARN-3901-YARN-2928.9.patch
>
>
> As per the schema proposed in YARN-3815
> (https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf),
> filing this JIRA to track creation and population of data in the flow run table.
> Some points that are being considered:
> - Stores per-flow-run information aggregated across applications, plus the flow version.
> - The RM’s collector writes to it on app creation and app completion.
> - The per-app collector writes to it for metric updates, at a slower frequency than the
>   metric updates to the application table.
> - Primary key: cluster ! user ! flow ! flow run id
> - Only the latest version of flow-level aggregated metrics will be kept, even if the
>   entity and application levels keep a timeseries.
> - The running_apps column will be incremented on app creation, and decremented on app
>   completion.
> - For min_start_time, the RM writer will simply write a value with the tag for the
>   applicationId. A coprocessor will return the min value of all written values.
> - Upon flush and compactions, the min value between all the cells of this column will
>   be written to the cell without any tag (empty tag) and all the other cells will be
>   discarded.
> - Ditto for max_end_time, except that the max will be kept.
> - Tags are represented as #type:value. The type can be not set (0), or can indicate
>   running (1) or complete (2). In those cases (for metrics) only complete app metrics
>   are collapsed on compaction.
> - The m! values are aggregated (summed) upon read. Only when applications are completed
>   (indicated by tag type 2) can the values be collapsed.
> - The application ids that have completed and been aggregated into the flow numbers are
>   retained in a separate column for historical tracking: we don’t want to re-aggregate
>   for those upon replay.
> 
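
The cell-collapsing rules in the quoted description can be sketched in a few lines. This is only an illustration in plain Python, not the HBase coprocessor code in the patch: the `Cell` class, the function names, and the sample values are all hypothetical, and an in-memory list stands in for the cells of one column. It assumes an untagged cell (type 0) holds a previously collapsed value.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Tag types from the description: not set (0), running (1), complete (2).
TAG_NONE, TAG_RUNNING, TAG_COMPLETE = 0, 1, 2


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one written cell; app_id is None when untagged."""
    value: int
    app_id: Optional[str] = None
    tag_type: int = TAG_NONE


def row_key(cluster: str, user: str, flow: str, flow_run_id: int) -> str:
    """Primary key layout: cluster ! user ! flow ! flow run id."""
    return "!".join([cluster, user, flow, str(flow_run_id)])


def read_min(cells: List[Cell]) -> int:
    """On read, min_start_time is the min of all written values."""
    return min(c.value for c in cells)


def compact_min(cells: List[Cell]) -> List[Cell]:
    """On flush/compaction, keep one untagged cell with the min; discard the rest."""
    return [Cell(value=read_min(cells))]


def read_metric(cells: List[Cell]) -> int:
    """On read, metric values are summed across all cells."""
    return sum(c.value for c in cells)


def compact_metric(cells: List[Cell]) -> Tuple[List[Cell], List[str]]:
    """Collapse completed (and already collapsed) cells into one untagged sum.

    Cells of running apps are kept as-is. The ids of the apps that were
    folded in are returned so they can be recorded and skipped on replay.
    """
    running = [c for c in cells if c.tag_type == TAG_RUNNING]
    done = [c for c in cells if c.tag_type != TAG_RUNNING]
    aggregated_ids = sorted(c.app_id for c in done if c.app_id is not None)
    kept = ([Cell(value=sum(c.value for c in done))] if done else []) + running
    return kept, aggregated_ids
```

The same shape applies to max_end_time with `max` in place of `min`. Note that the sum seen by a reader does not change across a compaction; only the number of stored cells does.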



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
