arrow-user mailing list archives

From Maciej Skrzypkowski <m.skrzypkow...@gmx.com>
Subject Re: Arrow C++ API - memory management
Date Tue, 10 Nov 2020 16:01:04 GMT
It seems that the memory leak is caused by another part of my code (which
I thought was fine), not by Arrow. I'll investigate further and file an
issue if needed.

On 10.11.2020 03:10, Wes McKinney wrote:
> The memory should automatically be freed by any object / shared_ptr /
> unique_ptr destruction. On Linux we use a background jemalloc thread
> by default so it may not be freed immediately but it should not be
> held indefinitely. In any case if you can reproduce the issue
> consistently we'd be glad to take a look, please open a Jira issue and
> provide as much information as you can to make it easy for us to
> reproduce
>
> On Mon, Nov 9, 2020 at 9:41 AM Maciej Skrzypkowski
> <m.skrzypkowski@gmx.com> wrote:
>> OK, thanks for the answer.
>>
>> mArrowTable is "std::shared_ptr<arrow::Table> mArrowTable" so should be managed
>> properly by the shared pointer. I've narrowed down the problem to code like this:
>>
>> void LoadCSVData::ReadArrowTableFromCSV( const std::string & filePath )
>> {
>>      auto tableReader = CreateTableReader( filePath );
>>      //ReadArrowTableUsingReader( *tableReader );
>> }
>>
>> std::shared_ptr<arrow::csv::TableReader> LoadCSVData::CreateTableReader( const std::string & filePath )
>> {
>>      arrow::MemoryPool* pool = arrow::default_memory_pool();
>>      auto tableReader = arrow::csv::TableReader::Make( pool, OpenCSVFile( filePath ),
>>                                                        *PrepareReadOptions(), *PrepareParseOptions(),
>>                                                        *PrepareConvertOptions() );
>>      if ( !tableReader.ok() )
>>      {
>>          throw BadParametersException( std::string( "CSV file reader error: " ) + tableReader.status().ToString() );
>>      }
>>      return *tableReader;
>> }
>>
>> Memory use still grows when ReadArrowTableFromCSV is called many times. Is
>> Arrow's memory pool freed when the TableReader is destroyed, or should I free it explicitly?
>>
>>
>> On 09.11.2020 15:01, Wes McKinney wrote:
>>
>> We'd prefer to answer questions on the mailing list or Jira (if
>> something looks like a bug).
>>
>> There isn't enough detail on the SO question to understand what other
>> things might be going on, but you are never destroying
>> this->mArrowTable which is holding on to allocated memory. If the
>> memory use keeps going up through repeated calls to the CSV reader
>> that sounds like a possible leak, so we would need to see more
>> details, including about your platform.
>>
>> On Mon, Nov 9, 2020 at 2:33 AM Maciej Skrzypkowski
>> <m.skrzypkowski@gmx.com> wrote:
>>
>> Hi All!
>>
>> I don't understand memory management in the Arrow C++ API. I have some
>> memory leaks while using it. I've posted a Stack Overflow question; maybe
>> someone can answer it:
>> https://stackoverflow.com/questions/64742588/how-to-manage-memory-while-reading-csv-using-apache-arrow-c-api
>>
>> Thanks,
>> Maciej Skrzypkowski
>>
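
The ownership points made in the thread (memory is released when the last shared_ptr / unique_ptr owner is destroyed, and a member such as mArrowTable keeps its memory alive until it is reassigned or reset) can be illustrated with a small standard-library-only sketch. It does not use Arrow: TrackingPool, FakeTable, Loader and ObserveLiveBytes are hypothetical stand-ins invented for this illustration. The counter is named after arrow::MemoryPool::bytes_allocated(), which can be sampled in the same way on arrow::default_memory_pool() before and after a call to see whether Arrow itself is still holding memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a memory pool: it only counts live bytes.
class TrackingPool {
public:
    void Add(std::size_t n) { bytes_ += n; }
    void Remove(std::size_t n) {
        assert(n <= bytes_);  // never release more than is currently live
        bytes_ -= n;
    }
    std::size_t bytes_allocated() const { return bytes_; }

private:
    std::size_t bytes_ = 0;
};

// Hypothetical stand-in for a table: owns a buffer and reports it to the pool.
class FakeTable {
public:
    FakeTable(TrackingPool& pool, std::size_t n) : pool_(pool), data_(n) {
        pool_.Add(data_.size());
    }
    ~FakeTable() { pool_.Remove(data_.size()); }
    FakeTable(const FakeTable&) = delete;
    FakeTable& operator=(const FakeTable&) = delete;

private:
    TrackingPool& pool_;
    std::vector<char> data_;
};

// Hypothetical loader that keeps the last table in a member, like mArrowTable.
struct Loader {
    std::shared_ptr<FakeTable> mTable;

    void Load(TrackingPool& pool, std::size_t n) {
        // The new table is built first; the assignment then drops the old one.
        mTable = std::make_shared<FakeTable>(pool, n);
    }
};

// Returns the live byte count observed at four points.
std::vector<std::size_t> ObserveLiveBytes() {
    TrackingPool pool;
    std::vector<std::size_t> seen;

    {
        auto scoped = std::make_shared<FakeTable>(pool, 1024);
        seen.push_back(pool.bytes_allocated());  // 1024 while 'scoped' is alive
    }
    seen.push_back(pool.bytes_allocated());      // 0: last owner left scope

    Loader loader;
    loader.Load(pool, 2048);
    loader.Load(pool, 4096);                     // 2048-byte table is destroyed
    seen.push_back(pool.bytes_allocated());      // 4096 still held by the member

    loader.mTable.reset();                       // explicit release
    seen.push_back(pool.bytes_allocated());      // 0
    return seen;
}
```

If a counter of this kind returns to its baseline after each call while the process's memory use keeps growing, the growth is probably coming from outside the pool, which matches the conclusion reached at the top of the thread. Note that Wes McKinney's remark about the background jemalloc thread concerns when freed memory is handed back to the operating system, which a pool-level counter like this does not show.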
