flink-user mailing list archives

From Felipe Gutierrez <felipe.o.gutier...@gmail.com>
Subject Setting the operator-id to measure percentile latency over several jobs
Date Thu, 05 Mar 2020 11:45:08 GMT
Hi community,

I am tracking the latency of operators in Flink following the latency
tracking documentation [1]. With Prometheus and Grafana I can query
"flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency"
and check the percentiles for each "operator_id" and each
"operator_subtask_index". Each "operator_subtask_index" corresponds to one
parallel instance of the physical operator, doesn't it?

How can I set a fixed ID for the "operator_id" in my code so that I can
quickly identify which operator I am measuring? I used
map(new MyMapUDF()).uid("my-operator-ID"), but it seems that a hash
function converts the string to a hash value. Which hash function is used,
so that I can identify my operator? I know that I can look the hash up
through the REST API [2], and that a named operator will always have the
same hash when I restart the job, but I would like to set the identifier
myself.
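To make the goal concrete, here is a minimal sketch of a possible workaround: keep a small lookup table from readable names to the hashed IDs and build the Prometheus selector from it. Everything except the metric name and the two label names is an assumption for illustration: the class OperatorLatencyQueries, the second entry "my-sink-ID", and both hex values are hypothetical placeholders, not real Flink output. The real hashes would have to be copied from the "operator_id" label or from the REST API [2].

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: maps readable operator names to
// the hashed IDs that show up in the "operator_id" label.
class OperatorLatencyQueries {

    // Metric name as quoted in this message.
    static final String METRIC =
        "flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_latency";

    // Placeholder hex values; replace them with the hashes observed for the job.
    static Map<String, String> knownOperators() {
        Map<String, String> ids = new LinkedHashMap<>();
        ids.put("my-operator-ID", "00000000000000000000000000000001");
        ids.put("my-sink-ID", "00000000000000000000000000000002");
        return ids;
    }

    // Builds a selector for one operator and one parallel subtask.
    static String selector(String readableName, int subtaskIndex) {
        String hash = knownOperators().get(readableName);
        if (hash == null) {
            throw new IllegalArgumentException("unknown operator: " + readableName);
        }
        return METRIC
            + "{operator_id=\"" + hash + "\""
            + ",operator_subtask_index=\"" + subtaskIndex + "\"}";
    }

    public static void main(String[] args) {
        // Prints the selector for subtask 0 of the operator registered as "my-operator-ID".
        System.out.println(selector("my-operator-ID", 0));
    }
}
```

This only moves the name-to-hash mapping out of my head and into code; it does not change the ID that Flink reports, which is what I am really asking for.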

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#latency-tracking
[2]
https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#rest-api-integration
--
Felipe Gutierrez
skype: felipe.o.gutierrez
https://felipeogutierrez.blogspot.com
