flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5464) MetricQueryService throws NullPointerException on JobManager
Date Mon, 23 Jan 2017 13:48:27 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834512#comment-15834512
] 

ASF GitHub Bot commented on FLINK-5464:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3128#discussion_r97316605
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/dump/MetricDumpSerialization.java
---
    @@ -64,122 +80,148 @@ private MetricDumpSerialization() {
     		 * @param gauges     gauges to serialize
     		 * @param histograms histograms to serialize
     		 * @return byte array containing the serialized metrics
    -		 * @throws IOException
     		 */
    -		public byte[] serialize(
    +		public MetricSerializationResult serialize(
     			Map<Counter, Tuple2<QueryScopeInfo, String>> counters,
     			Map<Gauge<?>, Tuple2<QueryScopeInfo, String>> gauges,
     			Map<Histogram, Tuple2<QueryScopeInfo, String>> histograms,
    -			Map<Meter, Tuple2<QueryScopeInfo, String>> meters) throws IOException
{
    -				
    -			baos.reset();
    -			dos.writeInt(counters.size());
    -			dos.writeInt(gauges.size());
    -			dos.writeInt(histograms.size());
    -			dos.writeInt(meters.size());
    +			Map<Meter, Tuple2<QueryScopeInfo, String>> meters) {
    +
    +			buffer.clear();
     
    +			int numCounters = 0;
     			for (Map.Entry<Counter, Tuple2<QueryScopeInfo, String>> entry : counters.entrySet())
{
    -				serializeMetricInfo(dos, entry.getValue().f0);
    -				serializeString(dos, entry.getValue().f1);
    -				serializeCounter(dos, entry.getKey());
    +				try {
    +					serializeCounter(buffer, entry.getValue().f0, entry.getValue().f1, entry.getKey());
    +					numCounters++;
    +				} catch (Exception e) {
    +					LOG.warn("Failed to serialize counter.", e);
    +				}
     			}
     
    +			int numGauges = 0;
     			for (Map.Entry<Gauge<?>, Tuple2<QueryScopeInfo, String>> entry :
gauges.entrySet()) {
    -				serializeMetricInfo(dos, entry.getValue().f0);
    -				serializeString(dos, entry.getValue().f1);
    -				serializeGauge(dos, entry.getKey());
    +				try {
    +					serializeGauge(buffer, entry.getValue().f0, entry.getValue().f1, entry.getKey());
    +					numGauges++;
    +				} catch (Exception e) {
    +					LOG.warn("Failed to serialize gauge.", e);
    +				}
     			}
     
    +			int numHistograms = 0;
     			for (Map.Entry<Histogram, Tuple2<QueryScopeInfo, String>> entry : histograms.entrySet())
{
    -				serializeMetricInfo(dos, entry.getValue().f0);
    -				serializeString(dos, entry.getValue().f1);
    -				serializeHistogram(dos, entry.getKey());
    +				try {
    +					serializeHistogram(buffer, entry.getValue().f0, entry.getValue().f1, entry.getKey());
    +					numHistograms++;
    +				} catch (Exception e) {
    +					LOG.warn("Failed to serialize histogram.", e);
    +				}
     			}
     
    +			int numMeters = 0;
     			for (Map.Entry<Meter, Tuple2<QueryScopeInfo, String>> entry : meters.entrySet())
{
    -				serializeMetricInfo(dos, entry.getValue().f0);
    -				serializeString(dos, entry.getValue().f1);
    -				serializeMeter(dos, entry.getKey());
    +				try {
    +					serializeMeter(buffer, entry.getValue().f0, entry.getValue().f1, entry.getKey());
    +					numMeters++;
    +				} catch (Exception e) {
    +					LOG.warn("Failed to serialize meter.", e);
    --- End diff --
    
    Should we decrease the log level to `debug` (here and the other lines)? The user won't
be able to act on the warning and in most cases this should work and we don't risk printing
out many log messages (as happened before a couple of times).


> MetricQueryService throws NullPointerException on JobManager
> ------------------------------------------------------------
>
>                 Key: FLINK-5464
>                 URL: https://issues.apache.org/jira/browse/FLINK-5464
>             Project: Flink
>          Issue Type: Bug
>          Components: Webfrontend
>    Affects Versions: 1.2.0
>            Reporter: Robert Metzger
>            Assignee: Chesnay Schepler
>
> I'm using Flink 699f4b0.
> My JobManager log contains many of these log entries:
> {code}
> 2017-01-11 19:42:05,778 WARN  org.apache.flink.runtime.webmonitor.metrics.MetricFetcher
    - Fetching metrics failed.
> akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/MetricQueryService#-970662317]]
after [10000 ms]
> 	at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)
> 	at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)
> 	at scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)
> 	at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)
> 	at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:474)
> 	at akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:425)
> 	at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:429)
> 	at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:381)
> 	at java.lang.Thread.run(Thread.java:745)
> 2017-01-11 19:42:07,765 WARN  org.apache.flink.runtime.metrics.dump.MetricQueryService
     - An exception occurred while processing a message.
> java.lang.NullPointerException
> 	at org.apache.flink.runtime.metrics.dump.MetricDumpSerialization.serializeGauge(MetricDumpSerialization.java:162)
> 	at org.apache.flink.runtime.metrics.dump.MetricDumpSerialization.access$300(MetricDumpSerialization.java:47)
> 	at org.apache.flink.runtime.metrics.dump.MetricDumpSerialization$MetricDumpSerializer.serialize(MetricDumpSerialization.java:90)
> 	at org.apache.flink.runtime.metrics.dump.MetricQueryService.onReceive(MetricQueryService.java:109)
> 	at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
> 	at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
> 	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
> 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> 	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> 	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message