flink-issues mailing list archives

From "Robert Metzger (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-456) Optional runtime statistics collection
Date Fri, 06 Feb 2015 17:12:34 GMT

     [ https://issues.apache.org/jira/browse/FLINK-456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-456:
---------------------------------
    Component/s: TaskManager
                 JobManager

> Optional runtime statistics collection
> --------------------------------------
>
>                 Key: FLINK-456
>                 URL: https://issues.apache.org/jira/browse/FLINK-456
>             Project: Flink
>          Issue Type: New Feature
>          Components: JobManager, TaskManager
>            Reporter: Fabian Hueske
>              Labels: github-import
>             Fix For: pre-apache
>
>
> The engine should collect job execution statistics (e.g., via accumulators) such as:
> - total number of input / output records per operator
> - histogram of input/output ratio of UDF calls
> - histogram of number of input records per reduce / cogroup UDF call
> - histogram of number of output records per UDF call
> - histogram of time spent in UDF calls
> - number of local and remote bytes read (not via accumulators)
> - ...
> These stats should be made available to the user after execution (via the web frontend). The purpose of this feature is to ease performance debugging of parallel jobs (e.g., to detect data skew).
> It should be possible to deactivate (or activate) the gathering of these statistics.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/456
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, runtime, user satisfaction
> Created at: Tue Feb 04 20:32:49 CET 2014
> State: open
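
The statistics requested above can be illustrated with a minimal sketch in plain Java. It is not Flink code and does not use any Flink API: the class name UdfCallStats and all of its methods are hypothetical, chosen only for this example. It assumes one collector per parallel operator instance, a flag that switches collection on or off, and a merge step that combines the per-instance results after execution (roughly how an accumulator-style design might behave).

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical per-operator statistics collector (illustration only, not a Flink API).
 * Tracks total input/output records and a histogram of output records per UDF call.
 */
final class UdfCallStats {
    // When false, recordCall() and merge() are no-ops, so collection can be deactivated.
    private final boolean enabled;
    private long inputRecords;
    private long outputRecords;
    // Histogram: (output records emitted by one UDF call) -> (number of such calls).
    private final TreeMap<Long, Long> outputsPerCall = new TreeMap<>();

    UdfCallStats(boolean enabled) {
        this.enabled = enabled;
    }

    /** Records one UDF call that consumed {@code in} records and emitted {@code out} records. */
    void recordCall(long in, long out) {
        if (!enabled) {
            return;
        }
        inputRecords += in;
        outputRecords += out;
        outputsPerCall.merge(out, 1L, Long::sum);
    }

    /** Adds the statistics of another parallel instance to this one. */
    void merge(UdfCallStats other) {
        if (!enabled) {
            return;
        }
        inputRecords += other.inputRecords;
        outputRecords += other.outputRecords;
        other.outputsPerCall.forEach((k, v) -> outputsPerCall.merge(k, v, Long::sum));
    }

    long inputRecords() {
        return inputRecords;
    }

    long outputRecords() {
        return outputRecords;
    }

    /** Read-only view of the outputs-per-call histogram, sorted by bucket. */
    Map<Long, Long> outputsPerCallHistogram() {
        return Collections.unmodifiableMap(outputsPerCall);
    }
}
```

Under these assumptions, data skew would show up when the per-instance inputRecords() values are compared before merging: one instance reporting far more input records than its peers suggests an unevenly distributed key. The remaining items in the list (time spent per call, local and remote bytes read) are not covered by this sketch.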



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
