flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Robert Metzger (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (FLINK-3660) Measure latency of elements and expose it as a metric
Date Fri, 14 Oct 2016 12:35:20 GMT

     [ https://issues.apache.org/jira/browse/FLINK-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Robert Metzger resolved FLINK-3660.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.0

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/a612b996

> Measure latency of elements and expose it as a metric
> -----------------------------------------------------
>
>                 Key: FLINK-3660
>                 URL: https://issues.apache.org/jira/browse/FLINK-3660
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Metrics, Streaming
>            Reporter: Robert Metzger
>            Assignee: Robert Metzger
>             Fix For: 1.2.0, pre-apache
>
>
> It would be nice to expose the end-to-end latency of a streaming job in the webinterface.
> To achieve this, my initial thought was to attach an ingestion-time timestamp at the
sources to each record.
> However, this introduces overhead for a monitoring feature users might not even use (8
bytes for each element + System.currentTimeMilis() on each element).
> Therefore, I suggest to implement this feature by periodically sending special events,
similar to watermarks through the topology. 
> Those {{LatencyMarks}} are emitted at a configurable interval at the sources and forwarded
by the tasks. The sinks will compare the timestamp of the latency marks with their current
system time to determine the latency.
> The latency marks will not add to the latency of a job, but the marks will be delayed
similarly than regular records, so their latency will approximate the record latency.
> Above suggestion expects the clocks on all taskmanagers to be in sync. Otherwise, the
measured latencies would also include the offsets between the taskmanager's clocks.
> In a second step, we can try to mitigate the issue by using the JobManager as a central
timing service. The TaskManagers will periodically query the JM for the current time in order
to determine the offset with their clock.
> This offset would still include the network latency between TM and JM but it would still
lead to reasonably good estimations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message