spark-dev mailing list archives

From Reynold Xin <r...@databricks.com>
Subject Re: graceful shutdown in external data sources
Date Wed, 16 Mar 2016 21:40:28 GMT
Maybe just add a watchdog thread and close the connection after some
timeout?
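A minimal sketch of that watchdog idea in plain Java. The `Closable` interface and the timeout bookkeeping are illustrative stand-ins, not the real kudu-client API; the point is that the watchdog itself runs as a daemon thread, so it never blocks JVM exit, while it can still force-close the resource whose non-daemon threads would:

```java
import java.util.concurrent.atomic.AtomicLong;

public class WatchdogDemo {
    // Stand-in for any client holding non-daemon (e.g. Netty) threads.
    public interface Closable { void close(); }

    // Starts a daemon thread that closes `client` once it has been idle
    // for longer than `timeoutMillis`. Callers bump `lastUsedMillis` on
    // each use of the client.
    public static Thread startWatchdog(Closable client,
                                       AtomicLong lastUsedMillis,
                                       long timeoutMillis) {
        Thread watchdog = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(Math.max(1, timeoutMillis / 4));
                    if (System.currentTimeMillis() - lastUsedMillis.get() > timeoutMillis) {
                        client.close();  // release the non-daemon threads
                        return;
                    }
                }
            } catch (InterruptedException e) {
                client.close();  // also close if the watchdog is interrupted
            }
        });
        watchdog.setDaemon(true);  // the watchdog must not itself keep the JVM alive
        watchdog.start();
        return watchdog;
    }
}
```

The drawback, of course, is picking a timeout: too short and a live connection gets torn down, too long and the shell still appears to hang for that duration.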

On Wednesday, March 16, 2016, Dan Burkert <dan@cloudera.com> wrote:

> Hi all,
>
> I'm working on the Spark connector for Apache Kudu, and I've run into an
> issue that is a bit beyond my Spark knowledge. The Kudu connector
> internally holds an open connection to the Kudu cluster
> <https://github.com/apache/incubator-kudu/blob/master/java/kudu-spark/src/main/scala/org/kududb/spark/KuduContext.scala#L37>
> which
> internally holds a Netty context with non-daemon threads. When using the
> Spark shell with the Kudu connector, exiting the shell via <ctrl>-D causes
> the shell to hang, and a thread dump reveals it's waiting for these
> non-daemon threads.  Registering a JVM shutdown hook to close the Kudu
> client does not do the trick, as it seems that the shutdown hooks are not
> fired on <ctrl>-D.
>
> I see that there is an internal Spark API for handling shutdown
> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/ShutdownHookManager.scala>,
> is there something similar available for cleaning up external data sources?
>
> - Dan
>
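For context, the shutdown-hook attempt Dan describes would look something like the sketch below (`closeClient` standing in for whatever closes the Kudu client). The hook registration itself is fine; the catch is that `Runtime` shutdown hooks only run once the JVM actually begins shutting down, and after Ctrl-D the JVM never reaches that point, because the client's non-daemon Netty threads are still running and a JVM does not exit normally until all non-daemon threads have finished:

```java
public class ShutdownHookDemo {
    // Register a hook that closes the client on JVM shutdown, returning
    // the hook thread so it can be deregistered later if needed.
    public static Thread registerCloseHook(Runnable closeClient) {
        Thread hook = new Thread(closeClient, "client-close-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

So the hook and the hang are a chicken-and-egg problem: the hook would close the threads, but shutdown (and therefore the hook) is blocked on those same threads.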
