spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Reynold Xin <r...@databricks.com>
Subject Re: Application level progress monitoring and communication
Date Mon, 30 Jun 2014 06:07:03 GMT
This isn't exactly about Spark itself, more about how an application on
YARN/Mesos can communicate with another one.

How about your application launch program just takes in a parameter (or env
variable or command line argument) for the IP address of your client
application, and just send updates? You basically just want to send
messages to report progress. You can do it with a lot of different ways,
such as Akka, custom REST API, Thrift ... I think any of them will do.




On Sun, Jun 29, 2014 at 7:57 PM, Chester Chen <chester@alpinenow.com> wrote:

> Hi Spark dev community:
>
> I have several questions regarding Application and Spark communication
>
> 1) Application Level Progress Monitoring
>
> Currently, our application using in YARN_CLUSTER model running Spark Jobs.
> This works well so far, but we would like to monitoring the application
> level progress ( not spark system level progress).
>
> For example,
> If we are doing Machine Learning Training, I would like to send some
> message back the our application, current status of the training, number of
> iterations etc via API.
>
> We can't use YARN_CLIENT mode for this purpose as we are running the spark
> application in servlet container (tomcat/Jetty). If we run the yarn_client
> mode, we will be limited to one SparkContext per JVM.
>
> So we are considering to leverage Akka messaging, essentially create
> another Actor to send message back to the client application.
> Notice that Spark already has an Akka ActorSystem defined for each
> Executor. All we need to find Actor address (host, port) for the spark
> driver executor.
>
> The trouble is that driver's host and port are not known until later when
> Resource Manager give to the executor node. How to communicate the host,
> port info back to the client application ?
>
> May be there is an Yarn API to obtain this information from Yarn Client.
>
>
> 2) Application and Spark Job communication In YARN Cluster mode.
>
>     There are several use cases we are thinking may require communication
> between the client side application and Spark Running Job.
>
>      One example,
>        * Try to stop a running job -- while job is running, abort the long
> running job in Yarn
>
>       Again, we are think to use Akka Actor to send a STOP job message.
>
>
>
> So here some of  questions:
>
> * Is there any work regarding this area in the community ?
>
> * what do you think the Akka approach ? Alternatives ?
>
> * Is there a way to get Spark's Akka host and port from Yarn Resource
> Manager to Yarn Client ?
>
> Any suggestions welcome
>
> Thanks
> Chester
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message