spark-issues mailing list archives

From "Trevor McKay (JIRA)" <>
Subject [jira] [Commented] (SPARK-3644) REST API for Spark application info (jobs / stages / tasks / storage info)
Date Tue, 23 Sep 2014 15:59:34 GMT


Trevor McKay commented on SPARK-3644:

Anecdotal notes from a consumer :)  I recently added some simple Spark job functions to Sahara
on OpenStack.  I needed "submit", "basic status", and "terminate".  A RESTful API would have
been great!  I ended up using ssh to the Spark master with a Python launcher around spark-submit,
plus pid files, stderr/stdout and os operations, to create the functions I wanted.  It works well,
but ...

I think some textual representation of the data on the web UI would have met all the status
needs.  I have simple states corresponding to "running", "completed successfully", "completed
with error", or "killed" based on the pid and result from the launcher script.
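
A minimal sketch of that kind of state mapping, under stated assumptions: the function
names are hypothetical, the launcher is assumed to record the job's exit code somewhere the
caller can read it, and a missing or negative exit code is assumed to mean the process was
killed (Python's subprocess module reports death by signal as a negative return code).  This
is an illustration, not the actual Sahara code.

```python
import os


def pid_alive(pid):
    """Return True if a process with this pid currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def job_status(running, exit_code):
    """Map launcher observations onto the four simple states.

    running   -- whether the saved pid is still alive (e.g. from pid_alive)
    exit_code -- exit code recorded by the launcher, or None if none was recorded
    """
    if running:
        return "running"
    if exit_code is None or exit_code < 0:
        # Assumption: no recorded result, or death by signal, counts as killed.
        return "killed"
    if exit_code == 0:
        return "completed successfully"
    return "completed with error"
```

A caller would combine the two, for example job_status(pid_alive(pid), exit_code), where pid
comes from the pid file and exit_code from whatever result the launcher saved.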

An additional question imho is whether to make it read-only or to allow submit/cancel operations.
spark-submit is pretty easy to use over ssh, but a REST version of spark-submit might be
a nice complement to a status API.

Cancellation was a bit harder, because I wanted the job to run asynchronously from ssh without
an open connection but still be cancellable.  This meant that I had to deal with closing file
descriptors, saving the pid, issuing kill, etc.  A cancel-by-id REST function would be great,
too, if this work can go beyond read-only status.
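
The steps above (detach from the ssh session, close inherited file descriptors, save the
pid, kill later) might look roughly like the following sketch.  Everything here is an
assumption made for illustration: the function names, the "pid" / "stdout" / "stderr" file
names inside a per-job working directory, and the choice of SIGTERM are hypothetical and are
not taken from the Sahara launcher.  It is POSIX only.

```python
import os
import signal
import subprocess


def launch_detached(cmd, workdir):
    """Start cmd in its own session so it survives the ssh connection closing.

    Output goes to files in workdir and the pid is saved there for later use.
    """
    out = open(os.path.join(workdir, "stdout"), "wb")
    err = open(os.path.join(workdir, "stderr"), "wb")
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,   # no terminal input from the ssh session
        stdout=out,
        stderr=err,
        close_fds=True,             # do not leak the caller's descriptors
        start_new_session=True,     # setsid(): new session and process group
    )
    # The child holds its own copies of these descriptors now.
    out.close()
    err.close()
    with open(os.path.join(workdir, "pid"), "w") as f:
        f.write(str(proc.pid))
    return proc.pid


def cancel(workdir, sig=signal.SIGTERM):
    """Signal the job's process group; return False if it is already gone."""
    with open(os.path.join(workdir, "pid")) as f:
        pid = int(f.read().strip())
    try:
        # After setsid() the pid is also the process group id, so this
        # reaches any children the launched command started as well.
        os.killpg(pid, sig)
    except ProcessLookupError:
        return False
    return True
```

For a Spark job, cmd would be a spark-submit command line given as a list of arguments.
Signalling the whole process group rather than the single pid is a design choice made here
so that child processes of the launched command are not left behind.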

> REST API for Spark application info (jobs / stages / tasks / storage info)
> --------------------------------------------------------------------------
>                 Key: SPARK-3644
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Web UI
>            Reporter: Josh Rosen
> This JIRA is a forum to draft a design proposal for a REST interface for accessing information
> about Spark applications, such as job / stage / task / storage status.
> There have been a number of proposals to serve JSON representations of the information
> displayed in Spark's web UI.  Given that we might redesign the pages of the web UI (and possibly
> re-implement the UI as a client of a REST API), the API endpoints and their responses should
> be independent of what we choose to display on particular web UI pages / layouts.
> Let's start a discussion of what a good REST API would look like from first principles.
> We can discuss what URLs / endpoints expose access to data, how our JSON responses will be
> formatted, how fields will be named, how the API will be documented and tested, etc.
> Some links for inspiration:

This message was sent by Atlassian JIRA
