aurora-dev mailing list archives

From Maxim Khutornenko <ma...@apache.org>
Subject Re: [PROPOSAL] New RPC `fetchJobUpdates`
Date Fri, 02 Sep 2016 21:52:03 GMT
As I mentioned in Slack, I am OK with the new RPC as long as there is
a use for it elsewhere in the client or UI. Adding a read-only RPC
that isn't going to be called by our traditional integration test
clients creates fertile ground for bit rot.

I am actually warming up to your original proposal of adding
JobUpdateQuery to the existing getJobUpdateDetails RPC. While it may
be more expensive to pull multiple updates, we don't risk much now
that we have migrated to MVStore on the H2 side. No table locks are
acquired, and the only downside would be pulling the events along
with the data you actually need. Provided the query is narrowly
scoped, that should deliver acceptable performance.
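
For illustration, a minimal sketch of how that alternative might look in
the Thrift IDL. The second parameter, its name, and its field id are
assumptions for discussion, not a settled signature; the existing `key`
parameter is kept as is:

````
/** Gets job update details. Hypothetical signature, for discussion only. */
Response getJobUpdateDetails(
    1: JobUpdateKey key,       // existing parameter, unchanged
    2: JobUpdateQuery query)   // assumed addition: may match several updates
````

Leaving field id 1 untouched should keep callers that pass only a key
compatible, since a new argument with a new id is simply unset for them.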

On Thu, Sep 1, 2016 at 2:24 PM, Zameer Manji <zmanji@apache.org> wrote:
> Hey,
>
> I've noticed a hole in our current API which makes it difficult to write
> external clients and other tooling around fetching the state of updates.
>
> Currently, to fetch updates we are given two RPCs:
> ````
> /** Gets job update summaries. */
> Response getJobUpdateSummaries(1: JobUpdateQuery jobUpdateQuery)
>
> /** Gets job update details. */
> Response getJobUpdateDetails(1: JobUpdateKey key)
>
> ````
>
> The `getJobUpdateSummaries` RPC is not scoped to a single update and
> returns a set of `JobUpdateSummary` structs. The struct is defined as:
> ````
> /** Summary of the job update including job key, user and current state. */
> struct JobUpdateSummary {
>   /** Unique identifier for the update. */
>   5: JobUpdateKey key
>
>   /** User initiated an update. */
>   3: string user
>
>   /** Current job update state. */
>   4: JobUpdateState state
> }
> ````
>
> The `getJobUpdateDetails` RPC is scoped to a single update and returns the
> following struct:
>
> ````
> struct JobUpdateDetails {
>   /** Update definition. */
>   1: JobUpdate update
>
>   /** History for this update. */
>   2: list<JobUpdateEvent> updateEvents
>
>   /** History for the individual instances updated. */
>   3: list<JobInstanceUpdateEvent> instanceEvents
> }
>
> ````
>
> Maxim mentioned to me yesterday that this RPC is scoped to a single update
> because assembling the `instanceEvents` can be extremely expensive. A query
> that could span more than a single update risks taking down the scheduler
> in a large cluster.
>
>
> The problem I discovered is that there is no batch API to get the
> inexpensive information inside the `JobUpdate` struct. For reference, this
> struct contains:
>
> ````
> /** Full definition of the job update. */
> struct JobUpdate {
>   /** Update summary. */
>   1: JobUpdateSummary summary
>
>   /** Update configuration. */
>   2: JobUpdateInstructions instructions
> }
> ````
>
> Consumers are forced to make several `getJobUpdateDetails` calls to get
> multiple `JobUpdate` structs. Since the `JobUpdate` struct is not expensive
> to assemble, I'm proposing a new RPC that will allow consumers to get
> several `JobUpdate` structs in a single call.
>
> ````
> /** Gets job updates. */
> Response getJobUpdates(1: JobUpdateQuery jobUpdateQuery)
> ````
>
> If there are no objections, I will file tickets and put up a patch to
> implement this.
>
> --
> Zameer Manji
