arrow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Wes McKinney <wesmck...@gmail.com>
Subject Re: [DISCUSS] raw pointers and FFI (C-level in-process array protocol)
Date Thu, 03 Oct 2019 16:17:29 GMT
Related: Gandiva invented its own particular way of passing memory
addresses through the JNI boundary rather than using Flatbuffers
messages

https://github.com/apache/arrow/blob/master/cpp/src/gandiva/jni/jni_common.cc#L505

I'm all for language-agnostic in-memory data passing, but there is a
use case for a C API to pass pointers at call sites while avoiding
flattening (disassembly) and unflattening (reassembly) steps.

On Thu, Oct 3, 2019 at 4:34 AM Antoine Pitrou <antoine@python.org> wrote:
>
>
> Hi Jacques,
>
> Le 03/10/2019 à 02:46, Jacques Nadeau a écrit :
> >
> > I think it is reasonable to argue that keeping any ABI (or header/struct
> > pattern) as narrow as possible would allow us to minimize overlap with the
> > existing in-memory specification. In Arrow's case, this could be as simple
> > as a single memory pointer for schema (backed by flatbuffers) and a single
> > memory location for data (that references the record batch header, which in
> > turn provides pointers into the actual arrow data). [...]
> >
> > [...] (For example, in a JVM
> > view of the world, working with a plain struct in java rather than a set of
> > memory pointers against our existing IPC formats would be quite painful and
> > we'd definitely need to create some glue code for users. I worry the same
> > pattern would occur in many other languages.)
>
> I'm trying to understand the point you're making.  Here you say that it
> was difficult for the JVM to deal with raw pointers.  But above you seem
> to argue for a flatbuffers-based serialization containing raw pointers.
>
> Here's another way to frame the question: how do you propose to do
> zero-copy between different languages if not by passing raw pointers to
> the Arrow data?  And if passing raw pointers is acceptable, what is
> wrong with the spec as proposed?
>
>
> As for creating glue code: yes, of course, that would be needed in most
> languages that want to provide this interface (including C++).  You do
> need a C FFI for that.  I'm quite sure it would be possible to implement
> this proposal in pure Python with ctypes / cffi, for example (as a toy
> example, since PyArrow exists :-)).  When writing the spec, I also took
> a look at the Go and Rust FFIs, and they seem good enough to interact
> with it.  I tried to take a look at JNI, but of course I got lost in the
> documentation :-)
>
> If you are worried that people start thinking that this proposal is part
> of the Arrow specification, perhaps we can make it clear that exposing
> this interface is optional for implementations.
>
> Regards
>
> Antoine.

Mime
View raw message