From: Robert Metzger
Date: Fri, 10 Mar 2017 14:40:28 +0100
Subject: Re: Streaming Exception
To: "user@flink.apache.org"

Hi,

this error is only logged at WARN level. As Kaibo already said, it's not a critical issue.

Can you send some more messages from your log? Usually the JobManager logs why a TaskManager has failed. The last few log messages of the failed TM itself are also often helpful.

On Fri, Mar 10, 2017 at 10:46 AM, Kaibo Zhou wrote:

> I think this is not the root cause of the job failure; this error is caused by
> other tasks failing. You can check the log of the first failed task.
>
> 2017-03-10 12:25 GMT+08:00 Govindarajan Srinivasaraghavan <govindraghvan@gmail.com>:
>
>> Hi All,
>>
>> I see the error below after running my streaming job for a while, when
>> the load increases. After a while the task manager becomes completely dead
>> and the job keeps restarting.
>>
>> Also, when I checked in the UI whether there is any back pressure, it kept
>> saying "sampling in progress" and no results were displayed. Is there an API
>> that can provide the back pressure details?
>>
>> 2017-03-10 01:40:58,793 WARN org.apache.flink.streaming.api.operators.AbstractStreamOperator - Error while emitting latency marker.
>> org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
>>     at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.emitLatencyMarker(OperatorChain.java:426)
>>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:848)
>>     at org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.onProcessingTime(StreamSource.java:152)
>>     at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$RepeatedTriggerTask.run(SystemProcessingTimeService.java:256)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.RuntimeException
>>     at org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:117)
>>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:848)
>>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.reportOrForwardLatencyMarker(AbstractStreamOperator.java:708)
>>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.processLatencyMarker(AbstractStreamOperator.java:690)
>>     at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.emitLatencyMarker(OperatorChain.java:423)
>>     ... 10 more
>> Caused by: java.lang.InterruptedException
>>     at java.lang.Object.wait(Native Method)
>>     at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:168)
>>     at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:138)
>>     at org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:132)
>>     at org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:107)
>>     at org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
>>     at org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:114)
>>     ... 14 more
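The stack trace quoted in this thread nests three exceptions, and the innermost one (java.lang.InterruptedException, raised while waiting for a network buffer) is the one that hints the task was interrupted rather than failing on its own. As a rough aid for reading such traces, here is a minimal Python sketch that pulls the "Caused by" chain out of a pasted log. The helper names cause_chain and root_cause are made up for this illustration and are not part of Flink; the sketch assumes each exception header sits on a single line (mail quote markers such as ">>" are tolerated).

```python
import re
from typing import List, Optional

# Head of a Java stack trace entry, e.g.
#   "Caused by: java.lang.InterruptedException"
#   "org.example.SomeException: message text"
# Only fully qualified names ending in "Exception" or "Error" are recognised.
_EXCEPTION = re.compile(
    r"^(?:Caused by:\s*)?((?:[A-Za-z_$][\w$]*\.)+[\w$]*(?:Exception|Error))\b"
)


def cause_chain(trace: str) -> List[str]:
    """Return the exception class names in a trace, outermost first."""
    chain = []
    for raw in trace.splitlines():
        # Drop mail quote markers (">", ">>") and indentation.
        line = raw.lstrip("> \t")
        # Skip stack frames and "... N more" lines.
        if line.startswith("at ") or line.startswith("..."):
            continue
        match = _EXCEPTION.match(line)
        if match:
            chain.append(match.group(1))
    return chain


def root_cause(trace: str) -> Optional[str]:
    """Return the innermost exception class name, or None if none is found."""
    chain = cause_chain(trace)
    return chain[-1] if chain else None


if __name__ == "__main__":
    import sys

    # Usage: python cause_chain.py < taskmanager.log
    for depth, name in enumerate(cause_chain(sys.stdin.read())):
        print("  " * depth + name)
```

Applied to the trace in this thread, the chain ends at java.lang.InterruptedException. That matches the advice above: the warning is likely a symptom, and the first failure logged by the JobManager or the failed TaskManager is where to look next.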