flink-dev mailing list archives

From "Chris Thomson (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-8445) hostname used in metric names for taskmanager and jobmanager are not consistent
Date Wed, 17 Jan 2018 15:24:00 GMT
Chris Thomson created FLINK-8445:
------------------------------------

             Summary: hostname used in metric names for taskmanager and jobmanager are not
consistent
                 Key: FLINK-8445
                 URL: https://issues.apache.org/jira/browse/FLINK-8445
             Project: Flink
          Issue Type: Bug
          Components: Metrics
    Affects Versions: 1.3.1
         Environment: I think this problem affects any configuration that enables metrics reporting
and includes '<host>' in a metrics scope format.  For example, the following Graphite
reporter configuration in flink-conf.yaml:


{code:java}
metrics.scope.jm: flink.<host>.jobmanager
metrics.scope.jm.job: flink.<host>.jobmanager.<job_name>
metrics.scope.tm: flink.<host>.taskmanager
metrics.scope.tm.job: flink.<host>.taskmanager.<job_name>
metrics.scope.task: flink.<host>.taskmanager.<job_name>.<task_name>.<subtask_index>
metrics.scope.operator: flink.<host>.taskmanager.<job_name>.<operator_name>.<subtask_index>

metrics.reporters: graphite
metrics.reporter.graphite.class: org.apache.flink.metrics.graphite.GraphiteReporter
...{code}
            Reporter: Chris Thomson


I enabled Flink metrics reporting to Graphite with system scopes that contain '<host>'
for both the job manager and the task manager.  The metrics reported to Graphite use
two different representations of '<host>'.

For *Task Manager metrics* it uses the *short hostname* (without the DNS domain).  This
is a result of logic in the org.apache.flink.runtime.taskmanager.TaskManagerLocation constructor
that tries to extract the short hostname from the fully qualified domain name returned by
InetAddress.getCanonicalHostName().
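
As a rough illustration of the kind of shortening described above, here is a standalone sketch.  It is not the Flink source: the class and method names are hypothetical, and it assumes the short hostname is simply everything before the first dot of the fully qualified name.

```java
// Illustrative sketch only; NOT the actual TaskManagerLocation code.
// ShortHostNameSketch and shortHostName are hypothetical names.
final class ShortHostNameSketch {

    /**
     * Returns the part of a fully qualified domain name before the first dot.
     * A name without a dot is returned unchanged.
     * Caveat: this naive version would also truncate an IP literal such as
     * "10.0.0.1", so a real implementation needs a separate check for that.
     */
    static String shortHostName(String fqdn) {
        int firstDot = fqdn.indexOf('.');
        return firstDot > 0 ? fqdn.substring(0, firstDot) : fqdn;
    }

    public static void main(String[] args) {
        // In practice the input would come from InetAddress.getCanonicalHostName();
        // a fixed example value is used here to avoid a DNS lookup.
        System.out.println(shortHostName("worker-1.example.com")); // prints "worker-1"
    }
}
```

With this kind of logic on the task manager side only, a host whose canonical name is worker-1.example.com is reported as worker-1 by the task manager.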

For *Job Manager metrics* it uses the *fully qualified domain name* (with the DNS domain).
This happens because neither org.apache.flink.runtime.jobmanager.JobManagerRunner
nor org.apache.flink.runtime.rpc.akka.AkkaRpcService contains logic to perform the equivalent
normalization of the fully qualified domain name down to the short hostname.

Ideally the '<host>' placeholders in the system scopes for the job manager and task
manager metrics would be replaced with a consistent value (either the short hostname
or the fully qualified domain name).  Even better would be a configuration option to
choose which one is used for metric name generation.
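
To make the request concrete, here is a minimal sketch of what a shared, configurable normalization could look like.  Everything in it is an assumption made for illustration: the HostNameStyle enum, the method names, and the idea of resolving '<host>' by plain string replacement are hypothetical and do not correspond to an existing Flink API or configuration key.

```java
// Hypothetical sketch of the proposed behaviour; not an existing Flink API.
final class HostScopeSketch {

    /** Hypothetical option values: which hostname form to use in metric names. */
    enum HostNameStyle { SHORT, FQDN }

    /**
     * Applies the same normalization regardless of which component reports.
     * Assumes the input is a fully qualified name; SHORT keeps the text before
     * the first dot, FQDN returns the input unchanged.
     */
    static String normalizeHost(String host, HostNameStyle style) {
        if (style == HostNameStyle.FQDN) {
            return host;
        }
        int firstDot = host.indexOf('.');
        return firstDot > 0 ? host.substring(0, firstDot) : host;
    }

    /** Substitutes the normalized host for the '<host>' placeholder in a scope format. */
    static String resolveScope(String scopeFormat, String host, HostNameStyle style) {
        return scopeFormat.replace("<host>", normalizeHost(host, style));
    }

    public static void main(String[] args) {
        String host = "node-1.example.com"; // example value, not a real host
        // Both scopes now resolve '<host>' the same way.
        System.out.println(resolveScope("flink.<host>.jobmanager", host, HostNameStyle.SHORT));
        // prints "flink.node-1.jobmanager"
        System.out.println(resolveScope("flink.<host>.taskmanager", host, HostNameStyle.SHORT));
        // prints "flink.node-1.taskmanager"
    }
}
```

The point of the sketch is only that the job manager and task manager code paths would call one shared normalization, so both scopes agree whichever style is chosen.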



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
