flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-10150) Inconsistent number of "Records received" / "Records sent"
Date Mon, 03 Sep 2018 08:44:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16601901#comment-16601901 ]

ASF GitHub Bot commented on FLINK-10150:
----------------------------------------

zentol commented on a change in pull request #6599: [FLINK-10150][metrics] Fix OperatorMetricGroup creation for Batch
URL: https://github.com/apache/flink/pull/6599#discussion_r214614670
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/TaskMetricGroup.java
 ##########
 @@ -144,15 +144,17 @@ public OperatorMetricGroup addOperator(OperatorID operatorID, String name) {
 			name = name.substring(0, METRICS_OPERATOR_NAME_MAX_LENGTH);
 		}
 		OperatorMetricGroup operator = new OperatorMetricGroup(this.registry, this, operatorID, name);
+		// unique OperatorIDs only exist in streaming, so we have to rely on the name for batch operators
+		final String key = operatorID + name;
 
 		synchronized (this) {
-			OperatorMetricGroup previous = operators.put(operatorID, operator);
+			OperatorMetricGroup previous = operators.put(key, operator);
 
 Review comment:
   I will rename the method but leave the code as is. It was intentionally written that way
   so that we only do a single lookup on the happy path. The default implementation of
   `putIfAbsent` is just syntactic sugar for separate get/put calls. While the HashMap
   _implementation_ of this method is indeed more efficient in this regard, that is an
   implementation detail.
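
   To illustrate the trade-off described above, here is a minimal sketch, not the actual
   Flink code. The class `OperatorRegistrySketch`, its method names, and the use of plain
   strings in place of `OperatorMetricGroup` are all hypothetical. The first method puts
   unconditionally and restores the previous entry only on a collision, so the common case
   costs one map operation. The second relies on `putIfAbsent`, whose default in
   `java.util.Map` performs a `get` followed by a `put`, although `HashMap` overrides it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the operator registration discussed in the review. */
final class OperatorRegistrySketch {

    // Stand-in for the `operators` map; values are strings instead of OperatorMetricGroup.
    private final Map<String, String> operators = new HashMap<>();

    /** Put first; restore the earlier entry only if one was displaced. */
    synchronized String addWithPut(String key, String group) {
        String previous = operators.put(key, group);
        if (previous == null) {
            // Happy path: nothing was registered under this key, one map operation in total.
            return group;
        }
        // Collision: a group already existed, so put it back and return it.
        operators.put(key, previous);
        return previous;
    }

    /** Same result via putIfAbsent; lookup cost depends on the Map implementation. */
    synchronized String addWithPutIfAbsent(String key, String group) {
        String previous = operators.putIfAbsent(key, group);
        return previous == null ? group : previous;
    }

    public static void main(String[] args) {
        OperatorRegistrySketch registry = new OperatorRegistrySketch();
        // The key mimics `operatorID + name` from the diff; the values are made up.
        System.out.println(registry.addWithPut("id-1Map", "first"));
        System.out.println(registry.addWithPut("id-1Map", "second"));
    }
}
```

   Both variants return the group that was registered first for a given key, so the
   choice between them affects only the number of lookups, not the result.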

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Inconsistent number of "Records received" / "Records sent"
> ----------------------------------------------------------
>
>                 Key: FLINK-10150
>                 URL: https://issues.apache.org/jira/browse/FLINK-10150
>             Project: Flink
>          Issue Type: Bug
>          Components: Metrics, Webfrontend
>    Affects Versions: 1.4.0, 1.5.0, 1.6.0, 1.7.0
>            Reporter: Helmut Zechmann
>            Assignee: Chesnay Schepler
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.4.3, 1.6.1, 1.7.0, 1.5.4
>
>         Attachments: record_counts_flink_1_3.png, record_counts_flink_1_4.png
>
>
> The flink web ui displays an inconsistent number of "Records received" / "Records sent" in the job overview "Subtasks" view.
> When I run the example wordcount batch job with a small input file on flink 1.3.2 I get
>  * 3 records sent by the first subtask and
>  * 3 records received by the second subtask
> This is the result I would expect.
>  
> If I run the same job on flink 1.4.0 / 1.5.2 / 1.6.0 I get
>  * 13 records sent by the first subtask and
>  * 3 records received by the second subtask
> In real-life jobs the numbers are much stranger.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
