arrow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Wes McKinney <wesmck...@gmail.com>
Subject Re: [DISCUSS] C-level in-process array protocol
Date Wed, 02 Oct 2019 02:34:44 GMT
I had an e-mail editing snafu so you can ignore the bottom "inline"
portion since it's just a restatement of what is written more clearly
above

On Tue, Oct 1, 2019 at 9:32 PM Wes McKinney <wesmckinn@gmail.com> wrote:
>
> hi Jacques,
>
> I think we've veered off course a bit and maybe we could reframe the discussion.
>
> Goals
> * A "drop-in" header-only C file that projects can use as a
> programming interface either internally only or to expose in-memory
> data structures between C functions at call sites. Ideally little to
> no disassembly/reassembly should be required on either "side" of the
> call site.
> * Simplifying adoption of Arrow for C programmers, or languages based
> around C FFI
>
> Non-goals
> * Expanding the columnar format or creating an alternative canonical
> in-memory representation
> * Creating a new canonical way to serialize metadata
>
> Note that this use case has been on my mind for more than 2 years:
> https://issues.apache.org/jira/browse/ARROW-1058
>
> I think there are a couple of potentially misleading things at play here
>
> 1. The use of the word "protocol". In C, a struct has a well-defined
> binary layout, so a C API is also an ABI. Using C structs to
> communicate data can be considered to be a protocol, but it means
> something different in the context of the "Arrow protocol". I think we
> need to call this a "C API"
>
> 2. The documentation for this in Antoine's PR is in the format/
> directory. It would probably be better to have a "C API" section in
> the documentation.
>
> The header file under discussion and the documentation about it is
> best considered as a "library".
>
> It might be useful at some point to create a C99 implementation of the
> IPC protocol as well using FlatCC with the goal of having a complete
> implementation of the columnar format in C with minimal binary
> footprint. This is analogous to the NanoPB project which is an
> implementation of Protocol Buffers with small code size
>
> https://github.com/nanopb/nanopb
>
> Let me know if this makes more sense.
>
> I think it's important to communicate clearly about this primarily for
> the benefit of the outside world which can confuse easily as we have
> observed over the last few years =)
>
> Wes
>

Mime
View raw message