flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Ewen <se...@apache.org>
Subject Re: JobManager not reachable
Date Thu, 15 Oct 2015 10:16:18 GMT
Blocking actor calls should not be an issue (even if they are there),
because the heartbeats go between the actor systems, rather than the
actors...

On Thu, Oct 15, 2015 at 12:14 PM, Till Rohrmann <till.rohrmann@gmail.com>
wrote:

> And please set akka.log.lifecycle.events: true to let Akka log also its
> lifecycle events.
> ​
>
> On Thu, Oct 15, 2015 at 12:12 PM, Robert Metzger <rmetzger@apache.org>
> wrote:
>
> > Can you start flink with logging level DEBUG ?
> > Then we can see from the TaskManager logs when the TM became inactive.
> > Maybe an Akka message is causing the actor to block?
> >
> > You can also monitor the GC from the TaskManager view in the web
> interface
> > (for example by looking at the total time spend for GCing).
> >
> > On Thu, Oct 15, 2015 at 12:09 PM, Stephan Ewen <sewen@apache.org> wrote:
> >
> > > Does not quite sound like GC is an issue.
> > >
> > > Hmmm, what else can make the failure detector kick in unexpectedly?
> > >
> > > On Thu, Oct 15, 2015 at 12:05 PM, Till Rohrmann <trohrmann@apache.org>
> > > wrote:
> > >
> > > > To verify wether GC is a problem you can enable logging of memory
> usage
> > > of
> > > > the JVM via taskmanager.debug.memory.startLogThread: true. The
> interval
> > > of
> > > > the logging is configured via taskmanager.debug.memory.logIntervalMs.
> > > > ​
> > > >
> > > > On Thu, Oct 15, 2015 at 12:00 PM, Matthias J. Sax <mjsax@apache.org>
> > > > wrote:
> > > >
> > > > > The problem is reproducible (it happens on each run).
> > > > >
> > > > > I doubt that GC is an issue here (at least from an UDF point of
> > view),
> > > > > because I read the file once and keep a String object for each
> line.
> > > > > This objects are kept to the very end; the UDF does not release
> them
> > > > > until it returns from "run()" method.
> > > > >
> > > > > The file is also small (10K) so main memory consumption should be a
> > > > > problem either.
> > > > >
> > > > > Of course, the UDF emits on a very high rate and I am not sure if
> > Flink
> > > > > creates a lots of short living object (Streaming does not reuse
> > object
> > > > > yet...) such that GC might become an issue here?
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > >
> > > > >
> > > > > On 10/15/2015 11:44 AM, Stephan Ewen wrote:
> > > > > > From what the logs show, the TaskManager does not send pings any
> > more
> > > > > for a
> > > > > > long time and is then considered failed and the tasks running on
> > that
> > > > > > TaskManager are considered failed as well. So far, nothing
> > unusual...
> > > > > >
> > > > > > Question is, why is it considered failed? Is this a reproducible
> > > > problem?
> > > > > > Or a one time occurrence? Does the user code accumulate so many
> > > objects
> > > > > to
> > > > > > put very heavy pressure on the GC?
> > > > > >
> > > > > > Stephan
> > > > > >
> > > > > >
> > > > > > On Wed, Oct 14, 2015 at 10:47 PM, Matthias J. Sax <
> > mjsax@apache.org>
> > > > > wrote:
> > > > > >
> > > > > >> One thing I forgot the add. I also have a Storm-WordCount job
> > (build
> > > > via
> > > > > >> FlinkTopologyBuilder) that uses the same
> > > > > >> "buffer-file-and-emit-over-and-over-again-pattern" in a spout.
> > This
> > > > job
> > > > > >> run just fine and stops regularly after 5 minutes.
> > > > > >>
> > > > > >> -Matthias
> > > > > >>
> > > > > >>
> > > > > >> On 10/14/2015 10:42 PM, Matthias J. Sax wrote:
> > > > > >>> No. See log below.
> > > > > >>>
> > > > > >>> Btw: the job is not cleaned up properly. Some task remain in
> > state
> > > > > >>> "Canceling".
> > > > > >>>
> > > > > >>> The program I execute is "Streaming WordCount" example with my
> > own
> > > > > >>> source function. This custom source (see below), reads a local
> > > > (small)
> > > > > >>> file, bufferes each line in an internal buffer, and emits the
> > > > buffered
> > > > > >>> lines over and over again (for 5 minutes). (A copy of the file
> is
> > > > > >>> located on every worker machine of course.) This is of course
> no
> > > > common
> > > > > >>> patttern; I just do this, because I have only a small file and
> > want
> > > > the
> > > > > >>> job to run longer.
> > > > > >>>
> > > > > >>> As the job is running for multiple minutes until it fails, I
> > assume
> > > > > that
> > > > > >>> this uncommon pattern should actually not cauase any problem. I
> > > just
> > > > > >>> report it, to make sure it is not the problem (hope you can
> > > confirm).
> > > > > >>>
> > > > > >>> -Matthias
> > > > > >>>
> > > > > >>>
> > > > > >>>>      private static final class SimpleFileSource implements
> > > > > >> ParallelSourceFunction<String> {
> > > > > >>>>              private static final long serialVersionUID =
> > > > > >> -5727198990308179551L;
> > > > > >>>>
> > > > > >>>>              volatile boolean running = true;
> > > > > >>>>              @Override
> > > > > >>>>              public void run(
> > > > > >>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext<String>
> > > > > >> ctx)
> > > > > >>>>                                              throws Exception
> {
> > > > > >>>>
> > > > > >>>>                      BufferedReader reader = new
> > > BufferedReader(new
> > > > > >> FileReader(
> > > > > >>>>
> "/data/mjsax/hamlet.txt"));
> > > > > >>>>                      ArrayList<String> buffer = new
> > > > > ArrayList<String>();
> > > > > >>>>                      int counter = 0;
> > > > > >>>>                      String line;
> > > > > >>>>
> > > > > >>>>                      long start = System.nanoTime();
> > > > > >>>>                      while (running && (line =
> > reader.readLine())
> > > !=
> > > > > >> null) {
> > > > > >>>>                              buffer.add(line);
> > > > > >>>>                              ctx.collect(line);
> > > > > >>>>                      }
> > > > > >>>>                      reader.close();
> > > > > >>>>
> > > > > >>>>                      while (running && (System.nanoTime() -
> > > start) <
> > > > > 5L
> > > > > >> * 60 * 1000 * 1000 * 1000) {
> > > > > >>>>                              ctx.collect(buffer.get(counter));
> > > > > >>>>                              counter = ++counter %
> > buffer.size();
> > > > > >>>>                      }
> > > > > >>>>              }
> > > > > >>>>
> > > > > >>>>              @Override
> > > > > >>>>              public void cancel() {
> > > > > >>>>                      running = false;
> > > > > >>>>              }
> > > > > >>>>      }
> > > > > >>>
> > > > > >>>
> > > > > >>> JobManager log:
> > > > > >>>
> > > > > >>>> 22:19:16,618 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (2/28)
> > > (29bb077aa0772c0c42a21b782d496b7c)
> > > > > >> switched from DEPLOYING to RUNNING
> > > > > >>>> 22:22:52,082 WARN  akka.remote.RemoteWatcher
> > > > > >>          - Detected unreachable: [akka.tcp://
> > > > flink@192.168.127.32:32994
> > > > > ]
> > > > > >>>> 22:22:52,083 INFO
> > org.apache.flink.runtime.jobmanager.JobManager
> > > > > >>           - Task manager akka.tcp://
> > > > > >> flink@192.168.127.32:32994/user/taskmanager terminated.
> > > > > >>>> 22:22:52,084 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (8/28)
> > (b3fb63bdac8f446e8bbaa015f30abfdc)
> > > > > >> switched from RUNNING to FAILED
> > > > > >>>> 22:22:52,093 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (1/28)
> > (e18d7fb12ea2b5b88dda07e79a6c62ba)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,096 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (2/28)
> > (b5af8ba1031971c07b8655bdf62acdea)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,096 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (3/28)
> > (47d0cbfb8472a50ac005305dfab91202)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,096 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (4/28)
> > (39144b5a2aa58be4667bdc38c50602b5)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,096 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (5/28)
> > (c6dcb3a5d755b162dbe13f32de406a1d)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,097 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (6/28)
> > (09d9649f2e9c83dbbe3d1b253651f136)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,097 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (7/28)
> > (f82fda5a32c0f96acda4185ade9e985f)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,097 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (9/28)
> > (dc9d2905ae42980a25c8870a1d89c056)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,097 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (10/28)
> > (2e9682428b32fca6fca787e1a6b71359)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,097 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (11/28)
> > (bae79dd7f2b9346c7d168db8143c0bd4)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,098 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (12/28)
> > (642c76fc78faafcb023bce9cec041e9d)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,098 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (13/28)
> > (1cf21c3abc8f91f82b5fb6175181baeb)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,098 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (14/28)
> > (97fe45f48434c568652a9f9cd885b2da)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,098 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (15/28)
> > (b8833cdda3fc205865ef611d176b33ea)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,098 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (16/28)
> > (157a107c35ac994503cca577389a4f80)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,099 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (17/28)
> > (509d1dacb4448b741aaf170009a31364)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,099 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (18/28)
> > (d6bce51dd6d00b6b2f7e59ffcc733242)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,099 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (19/28)
> > (dc6931d39b054b761cfed7345d683c29)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,099 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (20/28)
> > (0f288fa6cc5910235e32564c1740670f)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,099 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (21/28)
> > (80b811c2787d34c67941686e3c2be74a)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,100 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (22/28)
> > (c70d69252e7b3ff6a537a2b09180d818)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,100 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (23/28)
> > (29154b2299dd080e041ca8a349d3b418)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,100 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (24/28)
> > (f56827e4079567adb23406c44f765dcf)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,100 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (25/28)
> > (cd6f30e3a69b02367a6ddd097f9fb825)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,100 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (26/28)
> > (c4b3af71e26718513752595a6cbdcee5)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,101 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (27/28)
> > (b152ab3209ca6c68c6f20213ccb48f40)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,101 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (28/28)
> > (efcee09b90ca372bffa4d086ce0b104a)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,101 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (1/28)
> > > (b2189b3d40cb4fa24083ffefd43ffc4f)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,101 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (2/28)
> > > (29bb077aa0772c0c42a21b782d496b7c)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,101 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (3/28)
> > > (2eb046c3d88cb7e6deccb5ac9d3a3704)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,102 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (4/28)
> > > (63f6696e53f08dc420bb51a10a092531)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,102 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (5/28)
> > > (bd9c6fc6d89e64dd3bef9c21065dfc87)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,102 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (6/28)
> > > (b675c869f60552cc04c3842d297a2529)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,102 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (7/28)
> > > (3f56dfd2ff3fe595555be6a372bf7c7b)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,102 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (8/28)
> > > (94d29535cf2559e5fe6dc13b8a5e3f08)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,103 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (9/28)
> > > (5f99a88b2041719d202f20bbc5dd2ca6)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,103 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (10/28)
> > > > (7c1a75ec9794ae9197ac3e64ecbd4ad8)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,103 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (11/28)
> > > > (c110922406d6f58cd34c80abf1ace972)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,103 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (12/28)
> > > > (2f991c806e12dd9715da532f35d3cd83)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,103 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (13/28)
> > > > (4b59d9e7f5102f7a5d1c09e49ffca66e)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,104 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (14/28)
> > > > (6b966d3247685983a73b439666fe315a)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,104 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (15/28)
> > > > (b7f040b44a87db85c41b7f8e1808aecc)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,104 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (16/28)
> > > > (1a42f51a278bef23c497da8d34a28e91)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,104 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (17/28)
> > > > (b3d86978bc1bf04e2cab2700fb6975e0)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,104 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (18/28)
> > > > (a44cb0a419524a0b7e04d476285b03c0)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,104 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (19/28)
> > > > (0d2b87be2ec0eb7961c05a4cf7baf7a7)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,105 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (20/28)
> > > > (ebd6220fc5d98912811c19aa9fe29e63)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,105 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (21/28)
> > > > (6cd95e1bb7c22e310967eab4149fa93d)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,105 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (22/28)
> > > > (ddc9b7f1c122e2d36632c8f8e886d35b)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,105 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (23/28)
> > > > (0e3b8cc17d0169a12142532dd800f966)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,105 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (24/28)
> > > > (4ac383b01d8c5a2d395612202d1f876c)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,106 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (25/28)
> > > > (bf542fee30071b7cfa63bae514af6106)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,106 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (26/28)
> > > > (e60b40069c1c82c44318e053eacfa7bd)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,106 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (27/28)
> > > > (35d1ce78a6f61a75af41c7b6d9c403a9)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,106 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (28/28)
> > > > (de58a786dbd2557dcc7c54bf2710525b)
> > > > > >> switched from RUNNING to CANCELING
> > > > > >>>> 22:22:52,107 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (16/28)
> > > > (1a42f51a278bef23c497da8d34a28e91)
> > > > > >> switched from CANCELING to FAILED
> > > > > >>>> 22:22:52,107 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (28/28)
> > (efcee09b90ca372bffa4d086ce0b104a)
> > > > > >> switched from CANCELING to FAILED
> > > > > >>>> 22:22:52,108 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (15/28)
> > > > (b7f040b44a87db85c41b7f8e1808aecc)
> > > > > >> switched from CANCELING to FAILED
> > > > > >>>> 22:22:52,108 INFO
> > > org.apache.flink.runtime.instance.InstanceManager
> > > > > >>          - Unregistered task manager akka.tcp://
> > > > > >> flink@192.168.127.32:32994/user/taskmanager. Number of
> registered
> > > > task
> > > > > >> managers 19. Number of available slots 228.
> > > > > >>>> 22:22:52,109 INFO
> > org.apache.flink.runtime.jobmanager.JobManager
> > > > > >>           - Status of job 11b9636948e49d96965051902b0aee1f
> > > (Streaming
> > > > > >> WordCount) changed to FAILING.
> > > > > >>>> java.lang.Exception: The slot in which the task was executed
> has
> > > > been
> > > > > >> released. Probably loss of TaskManager
> > > > 8175fe04b5cad804b197fbd1f8e2a599
> > > > > @
> > > > > >> dbis32 - 12 slots - URL: akka.tcp://
> > > > > >> flink@192.168.127.32:32994/user/taskmanager
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.instance.SimpleSlot.releaseSlot(SimpleSlot.java:151)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.instance.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:547)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:119)
> > > > > >>>>         at
> > > > > >>
> > > org.apache.flink.runtime.instance.Instance.markDead(Instance.java:156)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:215)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:565)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > > > >>>>         at
> > > > > >>
> > > >
> > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> > > > > >>>>         at
> > > > > >>
> > > >
> > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> > > > > >>>>         at
> > > > > >>
> scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> > > > > >>>>         at
> akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:103)
> > > > > >>>>         at
> > > akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46)
> > > > > >>>>         at
> > > > > akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369)
> > > > > >>>>         at
> > > > > akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501)
> > > > > >>>>         at akka.actor.ActorCell.invoke(ActorCell.scala:486)
> > > > > >>>>         at
> > akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> > > > > >>>>         at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> > > > > >>>>         at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> > > > > >>>>         at
> > > > > >>
> > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > > > > >>>>         at
> > > > > >>
> > > >
> > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > > > >>>>         at
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > > > >>>> 22:22:52,132 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (15/28)
> > (b8833cdda3fc205865ef611d176b33ea)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,132 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (11/28)
> > (bae79dd7f2b9346c7d168db8143c0bd4)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,133 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (9/28)
> > (dc9d2905ae42980a25c8870a1d89c056)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,133 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (18/28)
> > (d6bce51dd6d00b6b2f7e59ffcc733242)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,134 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (6/28)
> > (09d9649f2e9c83dbbe3d1b253651f136)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,135 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (17/28)
> > (509d1dacb4448b741aaf170009a31364)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,135 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (26/28)
> > (c4b3af71e26718513752595a6cbdcee5)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,135 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (5/28)
> > (c6dcb3a5d755b162dbe13f32de406a1d)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,136 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (14/28)
> > (97fe45f48434c568652a9f9cd885b2da)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,136 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (25/28)
> > (cd6f30e3a69b02367a6ddd097f9fb825)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,137 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (13/28)
> > (1cf21c3abc8f91f82b5fb6175181baeb)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,137 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (10/28)
> > (2e9682428b32fca6fca787e1a6b71359)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,137 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (19/28)
> > (dc6931d39b054b761cfed7345d683c29)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,139 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (7/28)
> > (f82fda5a32c0f96acda4185ade9e985f)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,139 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (4/28)
> > (39144b5a2aa58be4667bdc38c50602b5)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,139 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (2/28)
> > (b5af8ba1031971c07b8655bdf62acdea)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,139 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (24/28)
> > (f56827e4079567adb23406c44f765dcf)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,140 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (22/28)
> > (c70d69252e7b3ff6a537a2b09180d818)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,140 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (3/28)
> > (47d0cbfb8472a50ac005305dfab91202)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,140 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (27/28)
> > (b152ab3209ca6c68c6f20213ccb48f40)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,141 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (23/28)
> > (29154b2299dd080e041ca8a349d3b418)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,141 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (20/28)
> > (0f288fa6cc5910235e32564c1740670f)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,142 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (12/28)
> > (642c76fc78faafcb023bce9cec041e9d)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:52,142 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (16/28)
> > (157a107c35ac994503cca577389a4f80)
> > > > > >> switched from CANCELING to CANCELED
> > > > > >>>> 22:22:57,003 INFO  Remoting
> > > > > >>           - Quarantined address [akka.tcp://
> > > > flink@192.168.127.32:32994]
> > > > > >> is still unreachable or has not been restarted. Keeping it
> > > > quarantined.
> > > > > >>>> 22:23:01,078 WARN  akka.remote.RemoteWatcher
> > > > > >>          - Detected unreachable: [akka.tcp://
> > > > flink@192.168.127.62:35242
> > > > > ]
> > > > > >>>> 22:23:01,079 INFO
> > org.apache.flink.runtime.jobmanager.JobManager
> > > > > >>           - Task manager akka.tcp://
> > > > > >> flink@192.168.127.62:35242/user/taskmanager terminated.
> > > > > >>>> 22:23:01,079 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (1/28)
> > > (b2189b3d40cb4fa24083ffefd43ffc4f)
> > > > > >> switched from CANCELING to FAILED
> > > > > >>>> 22:23:01,081 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (21/28)
> > (80b811c2787d34c67941686e3c2be74a)
> > > > > >> switched from CANCELING to FAILED
> > > > > >>>> 22:23:01,082 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > Keyed
> > > > > >> Aggregation -> Sink: Unnamed (2/28)
> > > (29bb077aa0772c0c42a21b782d496b7c)
> > > > > >> switched from CANCELING to FAILED
> > > > > >>>> 22:23:01,083 INFO
> > > > > >> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
> > > > Source:
> > > > > >> Custom Source -> Flat Map (1/28)
> > (e18d7fb12ea2b5b88dda07e79a6c62ba)
> > > > > >> switched from CANCELING to FAILED
> > > > > >>>> 22:23:01,084 INFO
> > > org.apache.flink.runtime.instance.InstanceManager
> > > > > >>          - Unregistered task manager akka.tcp://
> > > > > >> flink@192.168.127.62:35242/user/taskmanager. Number of
> registered
> > > > task
> > > > > >> managers 18. Number of available slots 216.
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>> On 10/11/2015 11:54 PM, Stephan Ewen wrote:
> > > > > >>>> Can you see is there is anything unusual in the JobManager
> logs?
> > > > > >>>> Am 11.10.2015 18:56 schrieb "Matthias J. Sax" <
> mjsax@apache.org
> > >:
> > > > > >>>>
> > > > > >>>>> Hi,
> > > > > >>>>>
> > > > > >>>>> I was just playing arround with Flink. After submitting my
> job,
> > > it
> > > > > runs
> > > > > >>>>> for multiple minutes, until I get the following Exception in
> > one
> > > if
> > > > > the
> > > > > >>>>> TaskManager logs and the job fails.
> > > > > >>>>>
> > > > > >>>>> I have no clue what's going on...
> > > > > >>>>>
> > > > > >>>>> -Matthias
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>> 18:43:23,567 WARN  akka.remote.RemoteWatcher
> > > > > >>>>>          - Detected unreachable: [akka.tcp://
> > > > > flink@192.168.127.11:6123
> > > > > >> ]
> > > > > >>>>>> 18:43:23,864 WARN  akka.remote.ReliableDeliverySupervisor
> > > > > >>>>>         - Association with remote system [akka.tcp://
> > > > > >>>>> flink@192.168.127.11:6123] has failed, address is now gated
> > for
> > > > > [5000]
> > > > > >>>>> ms. Reason is: [Disassociated].
> > > > > >>>>>> 18:43:23,866 INFO
> > > > org.apache.flink.runtime.taskmanager.TaskManager
> > > > > >>>>>         - TaskManager akka://flink/user/taskmanager
> disconnects
> > > > from
> > > > > >>>>> JobManager akka.tcp://
> > flink@192.168.127.11:6123/user/jobmanager:
> > > > > >>>>> JobManager is no longer reachable
> > > > > >>>>>> 18:43:23,867 INFO
> > > > org.apache.flink.runtime.taskmanager.TaskManager
> > > > > >>>>>         - Cancelling all computations and discarding all
> cached
> > > > data.
> > > > > >>>>>> 18:43:23,870 INFO  org.apache.flink.runtime.taskmanager.Task
> > > > > >>>>>          - Attempting to fail task externally Keyed
> Aggregation
> > > ->
> > > > > >> Sink:
> > > > > >>>>> Unnamed (1/28)
> > > > > >>>>>> 18:43:23,870 INFO  org.apache.flink.runtime.taskmanager.Task
> > > > > >>>>>          - Keyed Aggregation -> Sink: Unnamed (1/28) switched
> > to
> > > > > FAILED
> > > > > >>>>> with exception.
> > > > > >>>>>> java.lang.Exception: TaskManager
> akka://flink/user/taskmanager
> > > > > >>>>> disconnects from JobManager akka.tcp://
> > > > > >>>>> flink@192.168.127.11:6123/user/jobmanager: JobManager is no
> > > longer
> > > > > >>>>> reachable
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.taskmanager.TaskManager.handleJobManagerDisconnect(TaskManager.scala:826)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:297)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > >
> > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > >
> > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> > > > > >>>>>>         at
> > akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:119)
> > > > > >>>>>>         at
> > > > akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46)
> > > > > >>>>>>         at
> > > > > >> akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369)
> > > > > >>>>>>         at
> > > > > >> akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501)
> > > > > >>>>>>         at akka.actor.ActorCell.invoke(ActorCell.scala:486)
> > > > > >>>>>>         at
> > > akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> > > > > >>>>>>         at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> > > > > >>>>>>         at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > >
> > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > > > >>>>>> 18:43:23,879 INFO  org.apache.flink.runtime.taskmanager.Task
> > > > > >>>>>          - Triggering cancellation of task code Keyed
> > Aggregation
> > > > ->
> > > > > >> Sink:
> > > > > >>>>> Unnamed (1/28) (4b1f979a60c30f84ee60553730a4e99a).
> > > > > >>>>>> 18:43:23,880 INFO  org.apache.flink.runtime.taskmanager.Task
> > > > > >>>>>          - Attempting to fail task externally Source: Custom
> > > Source
> > > > > ->
> > > > > >> Flat
> > > > > >>>>> Map (1/28)
> > > > > >>>>>> 18:43:23,880 INFO  org.apache.flink.runtime.taskmanager.Task
> > > > > >>>>>          - Source: Custom Source -> Flat Map (1/28) switched
> to
> > > > > FAILED
> > > > > >> with
> > > > > >>>>> exception.
> > > > > >>>>>> java.lang.Exception: TaskManager
> akka://flink/user/taskmanager
> > > > > >>>>> disconnects from JobManager akka.tcp://
> > > > > >>>>> flink@192.168.127.11:6123/user/jobmanager: JobManager is no
> > > longer
> > > > > >>>>> reachable
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.taskmanager.TaskManager.handleJobManagerDisconnect(TaskManager.scala:826)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:297)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > >
> > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > >
> > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> > > > > >>>>>>         at
> > akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:119)
> > > > > >>>>>>         at
> > > > akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46)
> > > > > >>>>>>         at
> > > > > >> akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369)
> > > > > >>>>>>         at
> > > > > >> akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501)
> > > > > >>>>>>         at akka.actor.ActorCell.invoke(ActorCell.scala:486)
> > > > > >>>>>>         at
> > > akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> > > > > >>>>>>         at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> > > > > >>>>>>         at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > >
> > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > > > > >>>>>>         at
> > > > > >>>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message