arrow-user mailing list archives

From Maciej Skrzypkowski <m.skrzypkow...@gmx.com>
Subject Re: Arrow C++ API - memory management
Date Mon, 16 Nov 2020 09:56:47 GMT
I've confirmed that my own code was causing the memory leak. Thanks again
for your help.

On 10.11.2020 17:01, Maciej Skrzypkowski wrote:
> It seems that the memory leak is caused by another part of my code
> (which I thought was fine), not by Arrow. I'll investigate further
> and file an issue if needed.
>
> On 10.11.2020 03:10, Wes McKinney wrote:
>> The memory should automatically be freed by any object / shared_ptr /
>> unique_ptr destruction. On Linux we use a background jemalloc thread
>> by default so it may not be freed immediately but it should not be
>> held indefinitely. In any case, if you can reproduce the issue
>> consistently we'd be glad to take a look. Please open a Jira issue and
>> provide as much information as you can to make it easy for us to
>> reproduce.
>>
>> On Mon, Nov 9, 2020 at 9:41 AM Maciej Skrzypkowski
>> <m.skrzypkowski@gmx.com> wrote:
>>> OK, thanks for the answer.
>>>
>>> mArrowTable is declared as "std::shared_ptr<arrow::Table> mArrowTable",
>>> so it should be managed properly by the shared pointer. I've narrowed
>>> down the problem to code like this:
>>>
>>> void LoadCSVData::ReadArrowTableFromCSV( const std::string & filePath )
>>> {
>>>      auto tableReader = CreateTableReader( filePath );
>>>      //ReadArrowTableUsingReader( *tableReader );
>>> }
>>>
>>> std::shared_ptr<arrow::csv::TableReader>
>>> LoadCSVData::CreateTableReader( const std::string & filePath )
>>> {
>>>      arrow::MemoryPool* pool = arrow::default_memory_pool();
>>>      auto tableReader = arrow::csv::TableReader::Make(
>>>          pool, OpenCSVFile( filePath ), *PrepareReadOptions(),
>>>          *PrepareParseOptions(), *PrepareConvertOptions() );
>>>      if ( !tableReader.ok() )
>>>      {
>>>          throw BadParametersException(
>>>              std::string( "CSV file reader error: " ) +
>>>              tableReader.status().ToString() );
>>>      }
>>>      return *tableReader;
>>> }
>>>
>>> Memory still fills up when ReadArrowTableFromCSV is called many
>>> times. Is Arrow's memory pool freed when the TableReader is
>>> destroyed? Or should I free it explicitly?
>>>
>>>
>>> On 09.11.2020 15:01, Wes McKinney wrote:
>>>
>>> We'd prefer to answer questions on the mailing list or Jira (if
>>> something looks like a bug).
>>>
>>> There isn't enough detail on the SO question to understand what other
>>> things might be going on, but you are never destroying
>>> this->mArrowTable which is holding on to allocated memory. If the
>>> memory use keeps going up through repeated calls to the CSV reader
>>> that sounds like a possible leak, so we would need to see more
>>> details, including about your platform.
>>>
>>> On Mon, Nov 9, 2020 at 2:33 AM Maciej Skrzypkowski
>>> <m.skrzypkowski@gmx.com> wrote:
>>>
>>> Hi All!
>>>
>>> I don't understand memory management in the Arrow C++ API. I am seeing
>>> memory leaks while using it. I've posted a Stack Overflow question;
>>> maybe someone can answer it:
>>> https://stackoverflow.com/questions/64742588/how-to-manage-memory-while-reading-csv-using-apache-arrow-c-api
>>>
>>> Thanks,
>>> Maciej Skrzypkowski
>>>
