flink-dev mailing list archives

From "Matthias J. Sax" <mj...@informatik.hu-berlin.de>
Subject Re: [DISCUSS] Canceling Streaming Jobs
Date Wed, 27 May 2015 11:57:14 GMT
Stephan, I am not sure exactly what you mean by this... But I guess this is
an "add-on" that can be done later. It seems to be related to
https://issues.apache.org/jira/browse/FLINK-1929

I will open a JIRA for the new "terminate" message and assign it to myself.

-Matthias


On 05/27/2015 12:36 PM, Stephan Ewen wrote:
> +1 for the second option.
> 
> How about we allow passing a flag that indicates whether a checkpoint
> should be taken together with the cancellation?
> 
> 
> On Wed, May 27, 2015 at 12:27 PM, Aljoscha Krettek <aljoscha@apache.org>
> wrote:
> 
>> I would also prefer the second option. The first is more of a hack than
>> an option. :D
>> On May 27, 2015 9:14 AM, "Márton Balassi" <balassi.marton@gmail.com>
>> wrote:
>>
>>> +1 for the second option:
>>>
>>> It would also make it possible to properly commit a state checkpoint
>>> after the terminate message is triggered. In some cases this can be
>>> desirable behaviour.
>>>
>>> On Wed, May 27, 2015 at 8:46 AM, Gyula Fóra <gyfora@apache.org> wrote:
>>>
>>>> Hey,
>>>>
>>>> I would also strongly prefer the second option; users need to have the
>>>> option to force-cancel a program in case of unwanted behaviour.
>>>>
>>>> Cheers,
>>>> Gyula
>>>>
>>>> Matthias J. Sax <mjsax@informatik.hu-berlin.de> wrote (on Wed,
>>>> May 27, 2015, 1:20):
>>>>
>>>>> Hi,
>>>>>
>>>>> currently, the only way to stop a streaming job is to "cancel" the
>>>>> job. This has multiple disadvantages:
>>>>>  1) a "clean" stop is not possible (see
>>>>> https://issues.apache.org/jira/browse/FLINK-1929 -- I think a clean
>>>>> stop is a prerequisite for FLINK-1929), and
>>>>>  2) as a minor issue, all stopped streaming jobs are listed as
>>>>> "canceled" in the history (which is somewhat confusing for the user --
>>>>> at least it was for me when I started to work with Flink Streaming).
>>>>>
>>>>> This issue has been raised a few times already; however, no final
>>>>> conclusion was reached (if I remember correctly). I could not find a
>>>>> JIRA for it either.
>>>>>
>>>>> From my understanding of the system, there would be two ways to
>>>>> implement a nice way for stopping streaming jobs:
>>>>>
>>>>>   1) "Task"s are distinguished as either "batch" or "streaming"
>>>>>      -> canceling a batch job works as always
>>>>>      -> canceling a streaming job only sends a "canceling" signal to
>>>>> the sources and waits until the job finishes (i.e., sources stop
>>>>> emitting data and finish regularly, triggering the finishing of all
>>>>> operators).
>>>>> In this case, streaming jobs are stopped in a "clean way" (as if the
>>>>> input had been finite) and the job will be listed as "finished" in
>>>>> the history, as usual.
>>>>>
>>>>>   This approach has the advantage that it should be simpler to
>>>>> implement. However, the disadvantages are that (1) a "hard cancel" of
>>>>> jobs is no longer possible, and (2) Flink must be able to distinguish
>>>>> batch and streaming jobs (I don't think the Flink runtime can
>>>>> distinguish the two right now?)
>>>>>
>>>>>   2) A new message "terminate" (or similar) is introduced that can
>>>>> only be used for streaming jobs (it would be ignored for batch jobs);
>>>>> it stops the sources and waits until the job finishes regularly.
>>>>>
>>>>>   This approach has the advantage that current system behavior is
>>>>> preserved (it only adds a new feature). The disadvantage is that all
>>>>> clients need to be touched, and it must be clear to the user that
>>>>> "terminate" does not work for batch jobs. If an error/warning
>>>>> should be raised when a user tries to "terminate" a batch job, Flink
>>>>> must be able to distinguish between batch and streaming jobs, too. As
>>>>> an alternative, "terminate" on batch jobs could be interpreted as
>>>>> "cancel".
>>>>>
>>>>>
>>>>> I personally think that the second approach is better. Please give
>>>>> feedback. If we can reach a conclusion on how to implement it, I
>>>>> would like to work on it.
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>>
>>>>
>>>
>>
> 
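
For illustration only, the second option from the quoted proposal (keep
"cancel" as a hard stop, add a separate "terminate" that stops the sources
and lets the job finish regularly) could be modeled roughly as below. This
is a toy sketch, not Flink code: the names JobState, Source, Job, cancel()
and terminate() are hypothetical, and the draining of operators is assumed
to complete immediately.

```python
from enum import Enum


class JobState(Enum):
    RUNNING = "running"
    FINISHED = "finished"
    CANCELED = "canceled"


class Source:
    """Hypothetical streaming source that emits data until asked to stop."""

    def __init__(self):
        self.running = True

    def stop(self):
        self.running = False


class Job:
    """Hypothetical job handle; not part of any real Flink API."""

    def __init__(self, sources, streaming):
        self.sources = sources
        self.streaming = streaming
        self.state = JobState.RUNNING

    def cancel(self):
        # Hard cancel: existing behavior is preserved for batch and streaming.
        for source in self.sources:
            source.stop()
        self.state = JobState.CANCELED

    def terminate(self):
        # Proposed new message: only meaningful for streaming jobs,
        # ignored for batch jobs (one of the variants discussed above).
        if not self.streaming:
            return False
        for source in self.sources:
            source.stop()
        # Assumption: once all sources stop, operators drain and the job
        # ends as if its input had been finite.
        self.state = JobState.FINISHED
        return True


if __name__ == "__main__":
    streaming_job = Job([Source(), Source()], streaming=True)
    streaming_job.terminate()
    print(streaming_job.state.value)
```

With this split, a terminated streaming job would show up as "finished" in
the history, while "cancel" remains available as the forced stop.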

