arrow-user mailing list archives

From Antoine Pitrou <anto...@python.org>
Subject Re: Arrow C Data Interface
Date Tue, 20 Oct 2020 08:58:26 GMT

Hi Pasha,

It would be helpful to know in which broader context you're asking.  Are
you trying to do something in particular?

> i can't find any details on how to connect this
> protocol with other libraries/applications.

You use those libraries/applications' dedicated APIs.

Just like in Python, when a library's API says "you can pass any object
defining the buffer protocol for argument XXX", you can do just that.
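
As a rough illustration of that pattern (a minimal sketch, not taken from
any particular library; `sum_bytes` is a hypothetical function name), the
function below documents "pass any object supporting the buffer protocol"
and never asks which concrete type it received:

```python
import array


def sum_bytes(buf):
    """Hypothetical API: accepts any object supporting the buffer protocol."""
    # memoryview() raises TypeError if the object does not export a buffer.
    view = memoryview(buf)
    # Reinterpret the underlying memory as unsigned bytes, whatever the exporter.
    return sum(view.cast("B"))


# Three unrelated exporters, one consumer API.
print(sum_bytes(b"\x01\x02\x03"))
print(sum_bytes(bytearray([4, 5, 6])))
print(sum_bytes(array.array("B", [7, 8, 9])))
```

The caller learns that `sum_bytes` takes buffers from its documentation;
the protocol itself only defines what is exchanged.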

The C data interface is not an *API*.  It defines a standard for
exchanging data.  How that data is exposed or consumed is up to
third parties.

Regards

Antoine.




On 20/10/2020 at 03:46, Wes McKinney wrote:
> hi Pasha,
> 
> Copying dev@.
> 
> You can see how DuckDB interacts with the pyarrow data structures by
> the C interface here, maybe it's helpful
> 
> https://github.com/cwida/duckdb/blob/master/tools/pythonpkg/duckdb_python.cpp
> 
> We haven't defined a Python API (either C API level or Python API
> level) so that objects can advertise that they support the Arrow C
> interface -- it's a separate issue from the C interface itself (which
> doesn't have anything specifically to do with Python), and I agree it
> would probably be a good idea to have a standard way that we codify
> and document.
> 
> Thanks
> Wes
> 
> On Mon, Oct 19, 2020 at 12:34 PM Pasha Stetsenko <stpasha@gmail.com> wrote:
>>
>> Hi everybody,
>>
>> I've been reading http://arrow.apache.org/docs/format/CDataInterface.html,
>> which has been "... inspired by the Python buffer protocol", and i can't
>> find any details on how to connect this protocol with other
>> libraries/applications.
>>
>> Here's what I mean: with the python buffer protocol, i can create a new
>> type and set its `tp_as_buffer` field to a `PyBufferProcs` structure. This
>> way any other library can call `PyObject_CheckBuffer()` on my object to
>> check whether or not it supports the buffer interface, and then
>> `PyObject_GetBuffer()` to use that interface.
>>
>> I could not find the corresponding mechanisms in the Arrow C data
>> interface. For example, consider the "Exporting a simple int32 array"
>> tutorial in the article above. After creating `export_int32_type()`,
>> `release_int32_type()`, `export_int32_array()`, `release_int32_array()`
>> -- how do i announce to the world that these functions are available?
>> Conversely, if i want to talk to an Arrow Table via this interface --
>> where do i find the endpoints that return `ArrowSchema` and `ArrowArray`
>> structures?
>>
>> (I understand that there is an additional, more complicated API for
>> accessing arrow objects
>> http://arrow.apache.org/docs/python/extending.html, but this seems to be a
>> completely different API than what CDataInterface describes).
