arrow-user mailing list archives

From: Wes McKinney <wesmck...@gmail.com>
Subject: Re: Arrow C++ API - memory management
Date: Tue, 10 Nov 2020 02:10:05 GMT

The memory should be freed automatically when the owning object,
shared_ptr, or unique_ptr is destroyed. On Linux we use a background
jemalloc thread by default, so it may not be freed immediately, but it
should not be held indefinitely. In any case, if you can reproduce the
issue consistently we'd be glad to take a look. Please open a Jira
issue and provide as much information as you can to make it easy for
us to reproduce.
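
One way to tell a genuine leak from memory that the allocator is merely
holding on to is to compare a pool's live-byte counter before and after the
owning pointer goes out of scope. The sketch below illustrates that idea using
only the standard library. TrackingPool, PooledBuffer, and
LiveBytesAfterRepeatedLoads are hypothetical stand-ins invented for this
example, not Arrow classes; the counter is loosely modeled on
arrow::MemoryPool::bytes_allocated().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a memory pool that tracks live bytes, loosely
// modeled on the counter arrow::MemoryPool exposes via bytes_allocated().
class TrackingPool {
 public:
  std::uint8_t* Allocate(std::size_t size) {
    bytes_allocated_ += static_cast<std::int64_t>(size);
    return new std::uint8_t[size];
  }

  void Free(std::uint8_t* data, std::size_t size) {
    delete[] data;
    bytes_allocated_ -= static_cast<std::int64_t>(size);
  }

  std::int64_t bytes_allocated() const { return bytes_allocated_; }

 private:
  std::int64_t bytes_allocated_ = 0;
};

// Hypothetical stand-in for a reader or table: it takes one buffer from the
// pool and gives it back in its destructor (RAII).
class PooledBuffer {
 public:
  PooledBuffer(TrackingPool* pool, std::size_t size)
      : pool_(pool), size_(size), data_(pool->Allocate(size)) {}
  ~PooledBuffer() { pool_->Free(data_, size_); }

  PooledBuffer(const PooledBuffer&) = delete;
  PooledBuffer& operator=(const PooledBuffer&) = delete;

 private:
  TrackingPool* pool_;
  std::size_t size_;
  std::uint8_t* data_;
};

// Creates and drops a shared_ptr-owned buffer `iterations` times and returns
// how many bytes the pool still reports as live afterwards.
std::int64_t LiveBytesAfterRepeatedLoads(TrackingPool* pool, int iterations,
                                         std::size_t size) {
  for (int i = 0; i < iterations; ++i) {
    auto buffer = std::make_shared<PooledBuffer>(pool, size);
    // `buffer` is the only owner, so the destructor runs when this
    // iteration ends and the bytes go back to the pool.
  }
  return pool->bytes_allocated();
}
```

With Arrow itself, the analogous check would read
arrow::default_memory_pool()->bytes_allocated() before and after the repeated
calls. If that value returns to its baseline while the process's resident
memory stays high, allocator retention is the more likely explanation; if it
keeps growing, something is still holding a reference.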

On Mon, Nov 9, 2020 at 9:41 AM Maciej Skrzypkowski
<m.skrzypkowski@gmx.com> wrote:
>
> OK, thanks for the answer.
>
> mArrowTable is "std::shared_ptr<arrow::Table> mArrowTable", so it should be managed
> properly by the shared pointer. I've narrowed down the problem to code like this:
>
> void LoadCSVData::ReadArrowTableFromCSV( const std::string & filePath )
> {
>     auto tableReader = CreateTableReader( filePath );
>     //ReadArrowTableUsingReader( *tableReader );
> }
>
> std::shared_ptr<arrow::csv::TableReader> LoadCSVData::CreateTableReader( const std::string & filePath )
> {
>     arrow::MemoryPool* pool = arrow::default_memory_pool();
>     auto tableReader = arrow::csv::TableReader::Make( pool, OpenCSVFile( filePath ),
>                                                       *PrepareReadOptions(), *PrepareParseOptions(),
>                                                       *PrepareConvertOptions() );
>     if ( !tableReader.ok() )
>     {
>         throw BadParametersException( std::string( "CSV file reader error: " ) + tableReader.status().ToString() );
>     }
>     return *tableReader;
> }
>
> Memory still fills up when ReadArrowTableFromCSV is called many times. Is Arrow's
> memory pool freed when the TableReader is destroyed, or should I free it explicitly?
>
>
> On 09.11.2020 15:01, Wes McKinney wrote:
>
> We'd prefer to answer questions on the mailing list or Jira (if
> something looks like a bug).
>
> There isn't enough detail on the SO question to understand what other
> things might be going on, but you are never destroying
> this->mArrowTable which is holding on to allocated memory. If the
> memory use keeps going up through repeated calls to the CSV reader
> that sounds like a possible leak, so we would need to see more
> details, including about your platform.
>
> On Mon, Nov 9, 2020 at 2:33 AM Maciej Skrzypkowski
> <m.skrzypkowski@gmx.com> wrote:
>
> Hi All!
>
> I don't understand memory management in the C++ Arrow API. I have some
> memory leaks while using it. I've created a Stack Overflow question; maybe
> someone could answer it:
> https://stackoverflow.com/questions/64742588/how-to-manage-memory-while-reading-csv-using-apache-arrow-c-api
>
> Thanks,
> Maciej Skrzypkowski
>
