samza-dev mailing list archives

From Thunder Stumpges <thunder.stump...@gmail.com>
Subject Re: Problem: upgrade 1.2 to 1.3 caused loss of clean shutdown on SIGTERM
Date Wed, 08 Jan 2020 15:05:43 GMT
Thanks Abhishek,

I believe you are correct; removing the shutdown hook from inside the
container was the problem. I took the shutdown hook code removed from
SamzaContainer in your commit #83e152904ef5 and pulled it out to our App
Runner, calling LocalApplicationRunner.kill() and then
LocalApplicationRunner.waitForFinish(timeout) and I think that has restored
all of the shutdown sequencing.
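
A minimal sketch of that workaround, not the actual code from our App Runner. `Runner` is a hypothetical stand-in that exposes only the two calls named above (`kill()` and `waitForFinish(timeout)`), so the sketch compiles without Samza on the classpath; the real signatures on LocalApplicationRunner may differ. `FakeRunner` and the 5 second timeout are illustrative assumptions only.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ShutdownHookSketch {
    /** Hypothetical stand-in for the two runner calls named in the message. */
    interface Runner {
        void kill();
        boolean waitForFinish(Duration timeout);
    }

    /** Builds the hook thread: request shutdown, then wait up to the timeout. */
    static Thread shutdownHook(Runner runner, Duration timeout) {
        return new Thread(() -> {
            runner.kill();
            boolean finished = runner.waitForFinish(timeout);
            if (!finished) {
                System.err.println("Runner did not finish within " + timeout);
            }
        }, "app-shutdown-hook");
    }

    /** Toy runner (assumption) used only to exercise the hook without Samza. */
    static class FakeRunner implements Runner {
        final CountDownLatch done = new CountDownLatch(1);
        volatile boolean killed = false;

        @Override
        public void kill() {
            killed = true;
            done.countDown(); // pretend shutdown completes immediately
        }

        @Override
        public boolean waitForFinish(Duration timeout) {
            try {
                return done.await(timeout.toMillis(), TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        Runner runner = new FakeRunner();
        // The JVM runs registered hooks on normal exit and on SIGTERM.
        Runtime.getRuntime().addShutdownHook(
                shutdownHook(runner, Duration.ofSeconds(5)));
    }
}
```

Registering the hook in the application rather than inside the container keeps the ordering relative to any other application hooks under the application's control.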

Where do you think this belongs (in the scope of SAMZA-2426 that you
created)? Maybe in LocalApplicationRunner itself? You said you already took
care of shutdown sequence in cluster mode, yes? I'm open to helping on this
one, just let me know.

thanks,
Thunder



On Tue, Jan 7, 2020 at 7:12 PM Abhishek S <abkshvn@gmail.com> wrote:

> The rationale behind moving the shutdown handler out of SamzaContainer was
> to let standalone jobs (which are usually part of other online
> applications) maintain their own shutdown hooks.
> This prevents Samza hooks from running in parallel or causing a deadlock
> with the shutdown hooks of the parent application that uses Samza in
> standalone mode (as a library).
>
> That being said, I agree that Containers should attempt graceful shutdown
> and wait at most "task.shutdown.ms" on SIGTERM.
> I created SAMZA-2426 to investigate the issue further and track the work
> required.
>
> Abhishek
>
> On Tue, Jan 7, 2020 at 1:55 PM Brett Konold <bkonold@linkedin.com.invalid>
> wrote:
>
> > Thunder,
> >
> > How were you able to determine that the shutdown hooks are not being
> > called?
> >
> > If you're able to share any of your shutdown logs from before and after
> > your 1.3 upgrade, that would be helpful while I try to reproduce this
> > issue myself.
> >
> > Brett
> > ________________________________
> > From: Brett Konold <bkonold@linkedin.com>
> > Sent: Tuesday, January 7, 2020 1:25 PM
> > To: dev@samza.apache.org <dev@samza.apache.org>
> > Subject: Re: Problem: upgrade 1.2 to 1.3 caused loss of clean shutdown on
> > SIGTERM
> >
> > Hey Thunder,
> >
> > Thanks for reporting. Taking a look into this and will get back to you
> > when I have something.
> >
> > Brett
> > ________________________________
> > From: Thunder Stumpges <thunder.stumpges@gmail.com>
> > Sent: Monday, January 6, 2020 6:54 PM
> > To: dev@samza.apache.org <dev@samza.apache.org>
> > Subject: Problem: upgrade 1.2 to 1.3 caused loss of clean shutdown on
> > SIGTERM
> >
> > We are attempting to upgrade from samza 1.2 to 1.3 in hopes of fixing
> > https://issues.apache.org/jira/browse/SAMZA-2198 where there was a deadlock
> > in the shutdown code which prevented completing a clean shutdown.
> >
> > After the upgrade, it appears like NONE of the shutdown hooks / code are
> > being called and it just immediately shuts down.
> >
> > We are running standalone Low Level Tasks with LocalApplicationRunner /
> > ZkJobCoordinator in Docker / K8s.
> >
> > When killing from docker for testing, we use docker kill -s SIGTERM
> > <container> to send SIGTERM instead of SIGKILL. This was working in samza
> > 1.2 (other than the deadlock from the above issue).
> >
> > Any ideas what changed?
> >
> > Thanks,
> > Thunder
> >
>
