flink-user mailing list archives

From M Singh <mans2si...@yahoo.com>
Subject Re: Stopping a job
Date Tue, 09 Jun 2020 05:36:10 GMT
 Thanks Kostas, Arvid, and Senthil for your help.
    On Monday, June 8, 2020, 12:47:56 PM EDT, Senthil Kumar <senthilku@vmware.com> wrote:
 
 

I am just stating this for completeness.
 
  
 
When a job is cancelled, Flink sends an interrupt to the thread running the Source.run method.
 
  
 
For some reason (unknown to me), this does not happen when a Stop command is issued.
 
  
 
We ran into some minor issues because of said behavior.
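
To illustrate the point above (an editorial sketch, not code from this thread): a source loop that sits in an unbounded blocking call only wakes up when interrupted, so it can hang if no interrupt arrives. One way around that, assuming the source can poll with a timeout, is to re-check a volatile flag on every iteration. PollingSource, its input queue, and exitsWithoutInterrupt() are hypothetical names, and the class deliberately implements no Flink interface.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a source; not a Flink class. */
final class PollingSource implements Runnable {
    private final BlockingQueue<String> input = new LinkedBlockingQueue<>();
    // volatile so the write in cancel() is visible to the thread in run()
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            try {
                // Bounded wait: the flag is re-checked at least every 100 ms,
                // so the loop exits even if nobody interrupts this thread.
                String record = input.poll(100, TimeUnit.MILLISECONDS);
                if (record != null) {
                    System.out.println("emit " + record);
                }
            } catch (InterruptedException e) {
                // If an interrupt does arrive, restore the flag and leave.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** Only flips the flag; never interrupts the running thread. */
    void cancel() {
        running = false;
    }

    /** Returns true if the loop ends after cancel() alone, with no interrupt. */
    static boolean exitsWithoutInterrupt() {
        PollingSource source = new PollingSource();
        Thread thread = new Thread(source, "source-thread");
        thread.start();
        source.cancel();
        try {
            thread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("exited without interrupt: " + exitsWithoutInterrupt());
    }
}
```

The 100 ms timeout is an arbitrary choice for the sketch; it trades a little shutdown latency for not depending on an interrupt.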
 
  
 
From: Kostas Kloudas <kkloudas@gmail.com>
Date: Monday, June 8, 2020 at 2:35 AM
To: Arvid Heise <arvid@ververica.com>
Cc: M Singh <mans2singh@yahoo.com>, User-Flink <user@flink.apache.org>
Subject: Re: Stopping a job
 
  
 
What Arvid said is correct. 
 
The only thing I have to add is that "stop" also allows exactly-once sinks to push out their
buffered data to their final destination (e.g. Filesystem). In other words, it takes into account
side-effects, so it guarantees exactly-once end-to-end, assuming that you are using exactly-once
sources and sinks. Cancel with savepoint, on the other hand, did not necessarily do so: committing
side-effects followed a "best-effort" approach.
 
  
 
For more information you can check [1].
 
  
 
Cheers,
 
Kostas 
 
  
 
[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212
 
  
 
On Mon, Jun 8, 2020 at 10:23 AM Arvid Heise <arvid@ververica.com> wrote:
 

It was before I joined the dev team, so the following is somewhat speculative:
 
  
 
The concept of stoppable functions never really took off as it was a bit of a clumsy approach.
There is no fundamental difference between stopping and cancelling on (sub)task level. Indeed
if you look in the twitter source of 1.6 [1], cancel() and stop() are doing the exact same
thing. I'd assume that this is probably true for all sources.
 
  
 
So what is the difference between cancel and stop then? It lies in how you terminate
the whole DAG. On cancelling, you call cancel() on all tasks more or less simultaneously. If you
want to stop, it's a more fine-grained cancel, where you first stop the sources and then let
the tasks close themselves when all upstream tasks are done. Just before closing the tasks,
you also take a snapshot. Thus, the difference should not be visible in user code but only
in the Flink code itself (task/checkpoint coordinator).
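
The ordering described above can be pictured with a toy model (an editorial sketch, not Flink code and not how the runtime is implemented). It assumes a linear pipeline in which each stage holds some records it has not forwarded yet. Cancel ends all stages at once and drops what is buffered, while stop drains each stage from the sources downstream. Stage, cancel(), stop(), and samplePipeline() are hypothetical names, and the snapshot step is only marked by a comment.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class StopVsCancelToy {
    /** Hypothetical pipeline stage holding records it has not forwarded yet. */
    static final class Stage {
        final Deque<Integer> buffered = new ArrayDeque<>();

        Stage(Integer... records) {
            buffered.addAll(Arrays.asList(records));
        }
    }

    /** Cancel: all stages end at once, so buffered records never reach the end. */
    static List<Integer> cancel(List<Stage> pipeline) {
        for (Stage stage : pipeline) {
            stage.buffered.clear();
        }
        return new ArrayList<>();
    }

    /** Stop: drain the source first, then each downstream stage in turn. */
    static List<Integer> stop(List<Stage> pipeline) {
        List<Integer> delivered = new ArrayList<>();
        for (int i = 0; i < pipeline.size(); i++) {
            Stage stage = pipeline.get(i);
            while (!stage.buffered.isEmpty()) {
                Integer record = stage.buffered.poll();
                if (i + 1 < pipeline.size()) {
                    pipeline.get(i + 1).buffered.add(record); // forward downstream
                } else {
                    delivered.add(record); // last stage writes to the destination
                }
            }
            // A real stop would also take a snapshot before closing; omitted here.
        }
        return delivered;
    }

    /** Source holds 1 and 2, a middle stage holds 3, the sink holds 4. */
    static List<Stage> samplePipeline() {
        return Arrays.asList(new Stage(1, 2), new Stage(3), new Stage(4));
    }

    public static void main(String[] args) {
        System.out.println("cancel delivered " + cancel(samplePipeline()));
        System.out.println("stop delivered   " + stop(samplePipeline()));
    }
}
```

In this model, stop delivers all four sample records and cancel delivers none, which is the "more graceful" behaviour in miniature.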
 
  
 
So for your question:
 
1. No, as stop() and cancel() are the same thing at the task/UDF level.
 
2. Yes, stop is more graceful and creates a snapshot. [2] 
 
3. Not that I am aware of. In the whole flink code base, there are no more (see javadoc).
You could of course check if there are some in Bahir. But it shouldn't really matter. There
is no huge difference between stopping and cancelling if you wait for a checkpoint to finish.

 
4. Okay you answered your second question ;) Yes cancel with savepoint = stop now to make
it easier for new users.
 
  
 
[1] https://github.com/apache/flink/blob/release-1.6/flink-connectors/flink-connector-twitter/src/main/java/org/apache/flink/streaming/connectors/twitter/TwitterSource.java#L180-L190
 
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/cli.html
 
  
 
On Sun, Jun 7, 2020 at 1:04 AM M Singh <mans2singh@yahoo.com> wrote:
 

  
 
Hi Arvid:   
 
  
 
Thanks for the links.  
 
  
 
A few questions:
 
  
 
1. Is there any particular interface in 1.9+ that identifies the source as stoppable?
 
2. Is there any distinction between stop and cancel in 1.9+?
 
3. Is there any list of sources which are documented as stoppable besides the ones listed in
your SO link?
 
4. In 1.9+ there is a flink stop command and a flink cancel command (https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html#stop). 
So it appears that flink stop will take a savepoint and then call cancel, and cancel will just
cancel the job (it looks like cancel with savepoint is deprecated in 1.10).  
 
  
 
Thanks again for your help.
 
  
 
  
 
  
 
On Saturday, June 6, 2020, 02:18:57 PM EDT, Arvid Heise <arvid@ververica.com> wrote:
 
  
 
  
 
Yes, it seems as if FlinkKinesisConsumer does not implement it.
 
  
 
Here are the links to the respective javadoc [1] and code [2]. Note that in later releases
(1.9+) this interface has been removed. Stop is now implemented through a cancel() on source
level.
 
  
 
In general, I don't think that in a Kinesis to Kinesis use case, stop is needed anyways, since
there is no additional consistency expected over a normal cancel.
 
  
 
[1]https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/api/common/functions/StoppableFunction.html
 
[2]https://github.com/apache/flink/blob/release-1.6/flink-core/src/main/java/org/apache/flink/api/common/functions/StoppableFunction.java
 
  
 
On Sat, Jun 6, 2020 at 8:03 PM M Singh <mans2singh@yahoo.com> wrote:
 

Hi Arvid:
 
  
 
I checked the link and it indicates that only Storm SpoutSource, TwitterSource and NifiSource
support stop.   
 
  
 
Does this mean that FlinkKinesisConsumer is not stoppable ?
 

Also, can you please point me to the Stoppable interface mentioned in the link ?  I found
the following but am not sure if TwitterSource implements it :
 
https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/StartStoppable.java
 
  
 
Thanks
 
  
 
  
 
  
 
  
 
  
 
On Friday, June 5, 2020, 02:48:49 PM EDT, Arvid Heise <arvid@ververica.com> wrote:
 
  
 
  
 
Hi,
 
  
 
could you check if this SO thread [1] helps you already?
 
  
 
[1]https://stackoverflow.com/questions/53735318/flink-how-to-solve-error-this-job-is-not-stoppable
 
  
 
On Thu, Jun 4, 2020 at 7:43 PM M Singh <mans2singh@yahoo.com> wrote:
 

Hi:
 
  
 
I am running a job which consumes data from Kinesis and sends data to another Kinesis queue. 
I am using an older version of Flink (1.6), and when I try to stop the job I get an exception:
 
  
 
Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientException:
[Job termination (STOP) failed: This job is not stoppable.]
 
  
 
  
 
I wanted to find out what a stoppable job is, and whether it is possible to make a job stoppable
if it is reading from/writing to Kinesis?
 
  
 
Thanks
 
  
 



-- 
 
Arvid Heise | Senior Java Developer
 

 
  
 
Follow us @VervericaData
 
--
 
Join Flink Forward - The Apache Flink Conference
 
Stream Processing | Event Driven | Real Time
 
--
 
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
 
--
 
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng   
 



 
  