flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Ewen <se...@apache.org>
Subject Re: Checkpoints keep waiting on source locks
Date Wed, 21 Oct 2015 11:59:07 GMT
Hey!

The issue is that checkpoints can only happen in between elements being in
the pipeline. You block the pipeline in the sleep() call.
Since the checkpoint lock is not fair, the few cycles that the source
releases the lock are not enough for the checkpointer to acquire it.

I wonder if this is an artificial corner case, or actually an issue. The
solution is theoretically simple: Use a fair lock, but we would need to
break the data sources API and switch from "synchronized(Object)" to a fair
"java.concurrent.ReentrantLock".

Greetings,
Stephan


On Wed, Oct 21, 2015 at 1:47 PM, Gyula Fóra <gyula.fora@gmail.com> wrote:

> Hey All,
>
> I think there is some serious issue with the checkpoints. Running a simple
> program like this won't complete any checkpoints:
>
> StreamExecutionEnvironment env =
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.setParallelism(2);
> env.enableCheckpointing(5000);
>
> env.generateSequence(1, 100).map(t -> { Thread.sleep(1000); return t; })
> .map(t -> t).print();
> env.execute();
>
> The job will start executing and triggering checkpoints but the the
> triggerCheckpoint method of the StreamTask will be stuck waiting for the
> checkpoint lock. It will never take a snapshot...
>
> Any ideas?
> This happens on any parallelism, and for other sources as well.
>
> Cheers,
> Gyula
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message