flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4840) Measure latency of record processing and expose it as a metric
Date Fri, 04 Nov 2016 12:30:58 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636212#comment-15636212 ]

ASF GitHub Bot commented on FLINK-4840:

Github user StephanEwen commented on the issue:

    I think we need a different way to solve this.
    This pull request adds a very high overhead to the processing of each record:
      - two calls to `System.nanoTime()`
      - Maintaining a Dropwizard Histogram
    Without having benchmarked this, I would expect it to significantly reduce performance for typical
operations like filters or lightweight map functions.
    Flink is building a streaming runtime that is performance competitive with a batch runtime,
so the base runtime overhead per record needs to be minimal.
    All metrics so far have been designed with that paradigm in mind: Metrics may not add
any cost to the processing.
      - Metrics are gathered by asynchronous threads
      - The core uses only non-synchronized counters and gauges because they come quasi for free
      - We consciously decided not to use, in the data paths, any metric type that has the overhead
of creating objects or maintaining a data structure.
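
As a rough illustration of the design described in the list above, here is a minimal Java sketch. It is not Flink's actual metrics code: `SimpleCounter`, `processRecords`, and the reporter setup are hypothetical names chosen for this example. The hot path does a plain field increment and nothing else, while a separate thread reads the value periodically. Because the field is neither volatile nor synchronized, the reporter may see a slightly stale value, which is the trade-off this style of metric accepts.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CheapMetricsSketch {

    /** Hypothetical counter: a plain long field, with no locks, atomics, or allocation. */
    static final class SimpleCounter {
        private long count;

        void inc() {
            count++;
        }

        long getCount() {
            return count;
        }
    }

    /** Stand-in for a lightweight operator such as a filter (hypothetical). */
    static long processRecords(int[] records, SimpleCounter numRecordsIn) {
        long kept = 0;
        for (int record : records) {
            numRecordsIn.inc(); // the only metric work on the data path
            if (record % 2 == 0) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        SimpleCounter numRecordsIn = new SimpleCounter();

        // Asynchronous reporter: reads the counter off the data path.
        // The value it prints is approximate because the read is unsynchronized.
        ScheduledExecutorService reporter = Executors.newSingleThreadScheduledExecutor();
        reporter.scheduleAtFixedRate(
                () -> System.out.println("numRecordsIn (approx): " + numRecordsIn.getCount()),
                10, 10, TimeUnit.MILLISECONDS);

        int[] records = new int[1_000_000];
        for (int i = 0; i < records.length; i++) {
            records[i] = i;
        }
        long kept = processRecords(records, numRecordsIn);
        reporter.shutdownNow();

        System.out.println("kept=" + kept + ", counted=" + numRecordsIn.getCount());
    }
}
```

By contrast, the approach criticized above would add two `System.nanoTime()` calls and a histogram update inside the loop body for every record.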
    I suggest we first have a design discussion about whether we want to measure this
and how we can do it for free.
    For example, have a look at the "end to end" latency measurements #2386 via latency markers,
for an idea how to measure with minimal impact on the data processing.
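
To make the marker idea concrete, here is a small Java sketch under loud assumptions: it is not the code from #2386, and `Marker`, `withMarkers`, and `consume` are hypothetical names. For simplicity it injects a marker after a fixed number of records; a real implementation could just as well emit markers on a timer. The point it illustrates is that regular records carry no timing code, and the clock is read only when a marker is created or received.

```java
import java.util.ArrayList;
import java.util.List;

public class LatencyMarkerSketch {

    /** Hypothetical marker element: carries only the time it was emitted at the source. */
    static final class Marker {
        final long emittedAtNanos;

        Marker(long emittedAtNanos) {
            this.emittedAtNanos = emittedAtNanos;
        }
    }

    /** Source side: interleaves one marker after every `interval` records. */
    static List<Object> withMarkers(List<String> records, int interval) {
        List<Object> out = new ArrayList<>();
        int sinceMarker = 0;
        for (String record : records) {
            out.add(record);
            if (++sinceMarker == interval) {
                out.add(new Marker(System.nanoTime()));
                sinceMarker = 0;
            }
        }
        return out;
    }

    /** Sink side: only markers trigger a clock read; records take the untimed path. */
    static List<Long> consume(List<Object> stream) {
        List<Long> latenciesNanos = new ArrayList<>();
        for (Object element : stream) {
            if (element instanceof Marker) {
                long emittedAt = ((Marker) element).emittedAtNanos;
                latenciesNanos.add(System.nanoTime() - emittedAt);
            }
            // Regular record processing would happen here, with no timing code.
        }
        return latenciesNanos;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            records.add("record-" + i);
        }
        List<Object> stream = withMarkers(records, 5);
        List<Long> latencies = consume(stream);
        System.out.println("markers seen: " + latencies.size());
    }
}
```

With one marker per N records, the measurement cost is amortized over N records instead of being paid on each one.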

> Measure latency of record processing and expose it as a metric
> --------------------------------------------------------------
>                 Key: FLINK-4840
>                 URL: https://issues.apache.org/jira/browse/FLINK-4840
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics
>            Reporter: zhuhaifeng
>            Assignee: zhuhaifeng
>            Priority: Minor
>             Fix For: 1.2.0
> We should expose the following Metrics on the TaskIOMetricGroup:
> 1. recordProcessLatency(ms): Histogram measuring the processing time per record of a
task. For a chained task, it is the processing time of the whole chain.
> 2. recordProcTimeProportion(ms): Meter measuring the proportion of record processing
time for infor whether the main cost

