arrow-user mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: Copying and memory ownership question
Date Wed, 08 Jan 2020 17:29:19 GMT
hi Matt,

> when calling makeArrayWrapper from caller.py is the Array created within the makeArray function copied when it is converted into a python object?

No, the object is managed by a shared_ptr, so only the pointer is
shared; the underlying object is not copied.

> Also, is the memory freed by the python gc and/or the c++ lib in a timely way?

The memory is released as soon as the underlying array is destructed.
For example, in

    std::shared_ptr<arrow::Array> array;
    builder.Finish(&array);

if you allow "array" to go out of scope (and no other shared_ptr
references it), the memory buffers will be released immediately. You
can confirm this by checking bytes_allocated() on the MemoryPool* you
used when creating the array (here you used
arrow::default_memory_pool()).
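
As a rough illustration of that check, here is a minimal sketch that
uses only the standard library. CountingPool, PooledBuffer, PoolReading
and measure are hypothetical stand-ins, not Arrow classes; with Arrow
itself the corresponding reading would come from
arrow::MemoryPool::bytes_allocated() on the pool the builder used.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical, simplified stand-in for a pool that tracks its own usage.
class CountingPool {
public:
    void* Allocate(std::size_t n) {
        bytes_ += static_cast<std::int64_t>(n);
        return std::malloc(n);
    }
    void Free(void* p, std::size_t n) {
        std::free(p);
        bytes_ -= static_cast<std::int64_t>(n);
    }
    std::int64_t bytes_allocated() const { return bytes_; }

private:
    std::int64_t bytes_ = 0;
};

// Hypothetical buffer: takes memory from the pool on construction and
// hands it back in the destructor.
class PooledBuffer {
public:
    PooledBuffer(CountingPool* pool, std::size_t n)
        : pool_(pool), size_(n), data_(pool->Allocate(n)) {}
    ~PooledBuffer() { pool_->Free(data_, size_); }
    PooledBuffer(const PooledBuffer&) = delete;
    PooledBuffer& operator=(const PooledBuffer&) = delete;

private:
    CountingPool* pool_;
    std::size_t size_;
    void* data_;
};

struct PoolReading {
    std::int64_t before;
    std::int64_t during;
    std::int64_t after;
};

// Reads the pool counter before, during, and after the lifetime of a
// shared_ptr-owned buffer.
PoolReading measure(CountingPool* pool) {
    PoolReading r{};
    r.before = pool->bytes_allocated();
    {
        auto buf = std::make_shared<PooledBuffer>(pool, 64);
        r.during = pool->bytes_allocated();
    }  // buf goes out of scope here; it was the only owner
    r.after = pool->bytes_allocated();
    return r;
}
```

Calling measure() on a fresh CountingPool should show the counter rise
while the buffer is alive and return to its starting value once the
last shared_ptr is gone.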

> If there is copying or leaking in the above setup, what is the correct way to pass arrow objects created in c++ libraries back to python without copying or leaking

There isn't any copying or leaking in the code you provided -- the
object returned by pyarrow_wrap_array will follow normal Python object
semantics in Cython or Python. As soon as the Python wrapper object is
gc'd, the C++ shared_ptr inside it is destroyed. If it's the only
shared_ptr referencing the array (which it is in your example), then
the C++ object will be destroyed and the memory released.
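
The ownership handoff described above can be sketched with the standard
library alone. FakeArray and Wrapper below are hypothetical stand-ins
for arrow::Array and the Python wrapper object; the real wrapper is
created by pyarrow_wrap_array, and this sketch only assumes that it
holds a shared_ptr.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for arrow::Array. It counts live instances so
// the moment of destruction can be observed.
struct FakeArray {
    static int live;
    std::vector<std::string> values;
    explicit FakeArray(std::vector<std::string> v) : values(std::move(v)) { ++live; }
    FakeArray(const FakeArray&) = delete;  // copying the payload will not compile
    FakeArray& operator=(const FakeArray&) = delete;
    ~FakeArray() { --live; }
};
int FakeArray::live = 0;

// Hypothetical stand-in for the Python wrapper: it owns a shared_ptr only.
struct Wrapper {
    std::shared_ptr<FakeArray> sp;
};

std::shared_ptr<FakeArray> make_array() {
    return std::make_shared<FakeArray>(std::vector<std::string>{"A", "B", "C"});
}

// Returns the number of live FakeArray objects after the wrapper is gone.
int run_handoff() {
    {
        std::shared_ptr<FakeArray> a = make_array();
        const FakeArray* raw = a.get();
        Wrapper w{a};                  // copies the pointer, bumps the refcount
        assert(w.sp.get() == raw);     // same underlying object, no data copy
        assert(a.use_count() == 2);
        a.reset();                     // the local handle goes away
        assert(FakeArray::live == 1);  // the wrapper still keeps the payload alive
    }                                  // wrapper destroyed: last owner released
    return FakeArray::live;
}
```

run_handoff() should return 0: once the wrapper, the last owner, is
destroyed, the payload is destroyed with it.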

- Wes

On Wed, Jan 8, 2020 at 6:49 AM Calder, Matthew <mcalder@xbktrading.com> wrote:
>
> Hi,
>
> I created a minimal cython interface to c++ and I am unsure of whether or not memory is copied and how it is eventually freed. My files are:
>
> --- xbk.hpp ---
> #pragma once
> #include <arrow/api.h>
>
> namespace xbk {
>     std::shared_ptr<arrow::Array> makeArray();
> }
>
> --- xbk.cpp ---
> #include <vector>
> #include "xbk.hpp"
>
> namespace xbk {
>     std::shared_ptr<arrow::Array> makeArray()
>     {
>         std::vector<std::string> v = {"A", "B", "C"};
>         arrow::StringBuilder builder;
>         builder.AppendValues(v);
>         std::shared_ptr<arrow::Array> array;
>         builder.Finish(&array);
>         return array;
>     }
> }
>
> --- xbk.pxd ---
> from pyarrow.lib cimport *
>
> cdef extern from "xbk.cpp":
>     pass
>
> cdef extern from "xbk.hpp" namespace "xbk":
>     cdef shared_ptr[CArray] makeArray()
>
> --- xbk_arrow.pyx ---
> # distutils: language = c++
> from xbk cimport makeArray
> from pyarrow.lib cimport *
>
> def makeArrayWrapper():
>     a = makeArray()
>     return pyarrow_wrap_array(a)
>
> --- caller.py ---
> from xbk_arrow import makeArrayWrapper
>
> a = makeArrayWrapper()
> f"{a[0]} {a[1]} {a[2]}"
>
> My questions are: when calling makeArrayWrapper from caller.py is the Array created within the makeArray function copied when it is converted into a python object? Also, is the memory freed by the python gc and/or the c++ lib in a timely way? If there is copying or leaking in the above setup, what is the correct way to pass arrow objects created in c++ libraries back to python without copying or leaking? I read over https://arrow.apache.org/docs/python/extending.html but I am still unsure. Thanks for any help,
>
> Matt
