ignite-dev mailing list archives

From Ivan Rakov <ivan.glu...@gmail.com>
Subject Re: IEP-14: Ignite failures handling (Discussion)
Date Tue, 13 Mar 2018 23:13:51 GMT
I would just like to add my +1 for the "kill if standalone, stop if
embedded" default option. My arguments:

1) Regarding "If Ignite hangs - it will likely be impossible to stop":
Unfortunately, it's true that Ignite can hang during the stop procedure.
However, most of the failures described under IEP-14 (storage IO exceptions,
death of a critical system worker thread, etc.) normally shouldn't turn
a node into an "impossible to stop" state. Getting into that state is itself
a bug. I guess that we shouldn't choose system behavior on the basis of
known bugs.

2) A user might want to handle an Ignite node crash before the whole JVM
is shut down: raise an alert, close external resources, etc.

3) The IEP-14 document has important notes: "More than one Ignite node could
be started in one JVM process" and "Different nodes in one JVM process
could belong to different clusters". This is possible only in embedded
mode. I think we shouldn't shock the user with a sudden JVM halt (possibly
taking other healthy nodes down with it) if there's a chance of a successful
node stop.
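
To make the options in this thread concrete, here is a minimal Java sketch of
the mode-dependent default, an explicit override (such as the value of the
-DNODE_CRASH_ACTION property proposed below), and the "stop+kill" fallback.
All class, enum and method names are hypothetical and written only for
illustration; none of this is existing Ignite API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Illustrative only: every name here is hypothetical, not Ignite API. */
class NodeFailurePolicy {
    enum Action { STOP, KILL, STOP_THEN_KILL }

    /** "Kill if standalone, stop if embedded". */
    static Action defaultAction(boolean embedded) {
        return embedded ? Action.STOP : Action.KILL;
    }

    /** An explicit setting overrides the mode-dependent default. */
    static Action resolve(String configured, boolean embedded) {
        if (configured == null || configured.trim().isEmpty())
            return defaultAction(embedded);
        switch (configured.trim().toLowerCase()) {
            case "stop": return Action.STOP;
            case "kill": return Action.KILL;
            case "stop+kill": return Action.STOP_THEN_KILL;
            default: throw new IllegalArgumentException("Unknown action: " + configured);
        }
    }

    /**
     * "stop+kill": runs the stop routine in a daemon thread and falls back to
     * the kill routine if the stop does not finish within the timeout.
     * Returns true if the node stopped cleanly, false if kill was invoked.
     */
    static boolean stopThenKill(Runnable stop, Runnable kill, long timeoutMs) {
        CountDownLatch stopped = new CountDownLatch(1);
        Thread stopper = new Thread(() -> {
            try {
                stop.run();
                stopped.countDown();
            } catch (RuntimeException e) {
                // Failed stop: the latch stays closed, so kill runs after the timeout.
            }
        }, "node-stopper");
        stopper.setDaemon(true); // a hung stop must not keep the JVM alive
        stopper.start();
        try {
            if (stopped.await(timeoutMs, TimeUnit.MILLISECONDS))
                return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag, then fall back to kill
        }
        kill.run();
        return false;
    }

    public static void main(String[] args) {
        // Embedded mode with no override resolves to STOP.
        Action action = resolve(System.getProperty("NODE_CRASH_ACTION"), true);
        System.out.println("Chosen action: " + action);
        // The kill routine only prints here; a real one might call Runtime.getRuntime().halt(1).
        boolean clean = stopThenKill(() -> { }, () -> System.out.println("would halt JVM"), 1000);
        System.out.println("Stopped cleanly: " + clean);
    }
}
```

Passing the kill step as a Runnable keeps the sketch testable and leaves an
embedding application room to raise alerts or close resources before the JVM
goes down.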

Best Regards,
Ivan Rakov

On 14.03.2018 1:47, Dmitriy Setrakyan wrote:
> Guys, I do not think there is an understanding here. If Ignite hangs - it
> will likely be impossible to stop. So if you are suggesting "stop if
> embedded", you might as well suggest "do nothing if embedded".
>
> I have seen many Ignite deployments, embedded or not, large and small, and
> in all those deployments if Ignite went into a frozen state, killing it was
> the best option. Moreover, it provided the most predictable behavior. I am
> not guessing here, but it seems to me that the rest of the community is
> guessing.
>
> Killing a frozen Ignite node should be the default behavior in all cases,
> embedded or not. Stopping a frozen Ignite node should be a configurable
> option, so that a user has the ability to turn off the auto-kill behavior. We
> should also have a 3rd option, "stop+kill", so that if stopping fails, the
> process is automatically killed (this is also a good default option).
>
> Personally, I am OK if the default behavior is "kill" or "stop+kill", but
> it should be the same default in all cases. We should stop the practice of
> creating different default behaviors for the same problem. It is confusing
> and hard to document.
>
> D.
>
> On Tue, Mar 13, 2018 at 2:19 PM, Denis Magda <dmagda@apache.org> wrote:
>
>> +1 for the "kill if standalone, stop if embedded" behavior. If practice
>> shows that the node should be killed regardless of the mode, then it will
>> be an easy change. Now we are just guessing, and common sense suggests
>> going for "kill if standalone, stop if embedded" until we get feedback.
>>
>> -
>> Denis
>>
>> On Tue, Mar 13, 2018 at 8:30 AM, Dmitry Pavlov <dpavlov.spb@gmail.com>
>> wrote:
>>
>>> You are suggesting killing a process that was not started by Ignite,
>>> aren't you?
>>>
>>> It is more consistent to stop only those processes that are started under
>>> the control of Ignite, e.g. from ignite.sh; that is fine with me.
>>>
>>> If we release 'kill by default' as part of 2.5, we will end up with an
>>> emergency 2.6 release to change it back as soon as one user faces such
>>> unexpected behaviour.
>>>
>>> On Tue, 13 Mar 2018 at 18:17, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
>>>
>>>> Dmitriy,
>>>>
>>>> I think everyone is suggesting that stopping the node will likely be
>>>> impossible if Ignite is frozen. Moreover, it is very likely that all other
>>>> apps are frozen too.
>>>>
>>>> My comments are below...
>>>>
>>>> On Tue, Mar 13, 2018 at 9:12 AM, Dmitry Pavlov <dpavlov.spb@gmail.com>
>>>> wrote:
>>>>
>>>>> Please consider that a user application may use Ignite as an optional cache
>>>>> for some low-priority feature, while the main logic functions well without
>>>>> Ignite. I can say, as a former Ignite user, that this is quite a real case.
>>>> I have been a part of this project for a while, but I have never seen
>>>> Ignite used as an optional cache. Usually, Ignite is a mandatory part of
>>>> the application, not optional.
>>>>
>>>>
>>>>> The second real case is several WAR files running different logic within
>>>>> one application server. Some apps use Ignite, some do not.
>>>>> Killing the application server is not an option in this case either.
>>>>>
>>>> Not very likely, but possible. This is not a common use case. Most commonly
>>>> Ignite would be serving all WAR files with a common data layer.
>>>>
>>>>
>>>>> So the default should be to stop all node threads, but not to kill the
>>>>> process. If the user is aware that the process may be killed, they may
>>>>> set the option.
>>>>>
>>>> No, the default should be to kill the process. If the user does not like it,
>>>> then it should be possible to change it to stop the node first.
>>>>
>>>>
>>>>> On Tue, 13 Mar 2018 at 15:24, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
>>>>>
>>>>>> On Tue, Mar 13, 2018 at 8:16 AM, Dmitry Pavlov <dpavlov.spb@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Dmitriy, the alternative is "kill if standalone, stop if embedded".
>>>>>>>
>>>>>>> The user will still be able to set something like
>>>>>>> -DNODE_CRASH_ACTION="kill"
>>>>>>> if ignite.sh is not used and the user accepts that the whole process
>>>>>>> will be killed if a node crashes.
>>>>>>>
>>>>>>> The default would be 'node stop', but without hanging indefinitely.
>>>>>>>
>>>>>> Dmitriy, if Ignite is frozen, you will not be able to stop it. The only
>>>>>> guaranteed way to "un-freeze" the cluster is to kill the frozen JVM.
>>>>>>
>>>>>> On top of that, it is very likely that if you stop the "embedded" Ignite,
>>>>>> the user application will not be able to function anyway, so killing the
>>>>>> node does sound like a better and *safer* option.
>>>>>>
>>>>>> D.
>>>>>>

