arrow-dev mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: Arrow Flight protocol/API questions
Date Sun, 03 Mar 2019 23:22:20 GMT
On Tue, Feb 12, 2019 at 2:44 PM David Ming Li <David.M.Li@twosigma.com> wrote:
>
> Hi all,
>
> We've been evaluating Flight for our use, and we're wondering if the
> protocol is still open to extensions, as having a few application-defined
> metadata fields would help our use cases a lot.
>
> (Apologies if this is a repost - I was having issues with the spam filter.)
>
> Specifically, in DoGet, having a metadata binary blob in the
> server->client messages would help implement resumable requests,
> especially as we have non-monotonically-indexed data streams. This would
> also help us reuse server-side state if we do have to resume a stream.
>
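
To make the DoGet idea concrete, here is a minimal sketch in plain Python of how a per-message metadata blob could drive resumption. It does not use the Flight API: `do_get`, `fetch_all`, `BATCHES`, and the token encoding are all hypothetical stand-ins, and the client treats the token as opaque, which is what allows the server to choose any encoding when the data has no monotonic index.

```python
from typing import Iterator, List, Optional, Tuple

# Hypothetical stand-ins: a "batch" is a list of ints, and the per-message
# metadata is an opaque bytes token chosen by the server.
Batch = List[int]
BATCHES: List[Batch] = [[1, 2], [3, 4], [5, 6], [7, 8]]


def do_get(resume_token: Optional[bytes] = None) -> Iterator[Tuple[Batch, bytes]]:
    """Yield (batch, metadata) pairs, starting after the batch named by the token."""
    # Only the server interprets the token; here it is a batch position.
    start = 0 if resume_token is None else int(resume_token.decode()) + 1
    for index in range(start, len(BATCHES)):
        yield BATCHES[index], str(index).encode()


def fetch_all(fail_after: int) -> List[Batch]:
    """Read the stream, simulate one dropped connection, then resume."""
    received: List[Batch] = []
    token: Optional[bytes] = None
    try:
        for count, (batch, metadata) in enumerate(do_get()):
            if count == fail_after:
                raise ConnectionError("simulated network failure")
            received.append(batch)
            token = metadata  # remember the last token seen
    except ConnectionError:
        # Resume from the last token instead of restarting the stream.
        for batch, metadata in do_get(token):
            received.append(batch)
            token = metadata
    return received
```

With something like this, the client never re-reads batches it already holds, and the server can map the token back to whatever state it kept for the stream.
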
> In DoPut, we think making this call bidirectional would be useful to
> support application-level ACKs, again to implement resumable uploads. The
> server would thus have the option to send back an application-defined
> binary blob at any point during an upload. This is less important, as you
> could imagine starting a plain gRPC server-streaming call alongside the
> Flight DoPut call to do the same. But as you can't bind a gRPC and Flight
> service on the same port/channel, this is somewhat inconvenient.
>
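
The DoPut proposal can be sketched the same way. The following is plain Python rather than Flight code, and `UploadServer`, `do_put`, `upload`, and the ACK payload format are hypothetical: the server answers each stored batch with an application-defined blob, and the client uses the last ACK to decide where to restart after a failure.

```python
from typing import Iterable, Iterator, List

Batch = List[int]


class UploadServer:
    """Hypothetical server that stores batches and acknowledges each one."""

    def __init__(self) -> None:
        self.stored: List[Batch] = []

    def do_put(self, batches: Iterable[Batch], fail_at: int = -1) -> Iterator[bytes]:
        """Store batches one by one, yielding an ACK blob after each."""
        for batch in batches:
            if len(self.stored) == fail_at:
                raise ConnectionError("simulated network failure")
            self.stored.append(batch)
            # Application-defined ACK payload: the number of batches stored so far.
            yield str(len(self.stored)).encode()


def upload(server: UploadServer, batches: List[Batch], fail_at: int = -1) -> int:
    """Upload all batches, resuming after the last ACK; return the attempt count."""
    acked = 0
    attempts = 0
    while acked < len(batches):
        attempts += 1
        # Inject the simulated failure on the first attempt only.
        inject = fail_at if attempts == 1 else -1
        try:
            for ack in server.do_put(batches[acked:], inject):
                acked = int(ack.decode())
        except ConnectionError:
            continue  # retry from the last acknowledged batch
    return attempts
```

Because the ACKs arrive on the same call as the upload, no second server-streaming call is needed to learn how far the server got.
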

Both the DoGet and DoPut changes seem reasonable to me.

>
> That leads me to the API-level niggles we have; it would be nice to be
> able to bind gRPC services alongside a Flight service, and conversely be
> able to reuse a gRPC channel across gRPC and Flight clients, though
> breaking the hiding of gRPC isn't desirable.
>

At least in C++ it might make sense to have some segregated APIs that
permit binding to an existing gRPC service, or interacting with a gRPC
service that is a composite of the Flight service along with some
additional RPC methods. That seems fine as long as it remains possible
to use the Flight API for simple clients and servers without requiring
the grpc libraries and headers on a host system (so that grpc does not
have to be a transitive package dependency).

best
Wes

>
> Meanwhile, it would be nice to wrap the gRPC server 'awaitTermination'
> methods, so that we don't have to busy-wait ourselves (as in Java) or have
> the option to not busy-wait taken away from us (as in C++). In particular,
> when investigating Python bindings to C++ [0], the fact that
> FlightServerBase::Run also calls grpc::Server::Wait for you means that
> Ctrl-C no longer works in Python.
>
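
One possible shape for this, sketched in plain Python with a hypothetical `SketchServer` class (this is not the Flight or gRPC API): starting the server and waiting for it are separate calls, and the wait accepts a timeout. Python signal handlers only run once control returns to the interpreter, so waiting in short slices is one way to keep Ctrl-C responsive.

```python
import threading
from typing import Optional


class SketchServer:
    """Hypothetical server whose start and wait steps are separate calls."""

    def __init__(self) -> None:
        self._stopped = threading.Event()
        self._worker: Optional[threading.Thread] = None

    def start(self) -> None:
        """Begin serving on a background thread and return immediately."""
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def _serve(self) -> None:
        # Placeholder for request handling; loops until shutdown() is called.
        while not self._stopped.wait(0.01):
            pass

    def await_termination(self, timeout: Optional[float] = None) -> bool:
        """Block until shutdown; return False if the timeout elapsed first."""
        return self._stopped.wait(timeout)

    def shutdown(self) -> None:
        self._stopped.set()
        if self._worker is not None:
            self._worker.join()


def run_until_interrupted(server: SketchServer, poll_seconds: float = 0.5) -> None:
    """Serve until shutdown or Ctrl-C, waking periodically so signals are seen."""
    server.start()
    try:
        while not server.await_termination(poll_seconds):
            pass
    except KeyboardInterrupt:
        server.shutdown()
```

A caller that wants the current blocking behaviour can still call `await_termination()` with no timeout; the point of the split is only that blocking becomes the caller's choice.
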
> Does what we're trying to accomplish make sense? Are there better ways to
> achieve resumable uploads/downloads in the current protocol?
>
> [0]: https://github.com/apache/arrow/pull/3566
>
> Thanks,
> David
>
