arrow-user mailing list archives

From: Matt Youill <matt.you...@airmettle.com>
Subject: Re: Cannot create default memory pool
Date: Wed, 24 Mar 2021 03:29:00 GMT

No CSV in these instances.
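
For reference, the failure mode described further down the thread (the backend selector in memory_pool.cc ending up with a value that matches no known backend) can be modelled with a short standalone sketch. This is not Arrow's source: `Backend`, `ResolvePool` and `CorruptedBackend` are hypothetical names, the jemalloc fallback is an assumption, and `std::optional` stands in for `util::optional`. It only shows why a garbage selector value would reach the "cannot create default memory pool" path, which would be consistent with the abort from ArrowLog in the backtrace.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the backend enum. System is 0, as noted in the thread.
enum class Backend : int { System = 0, Jemalloc = 1 };

// Hypothetical helper: maps a backend selection to a pool name.
// Returns the error text when the value matches no known enumerator.
std::string ResolvePool(std::optional<Backend> user_selected) {
  // Assumption: with no user selection, the default backend is jemalloc.
  Backend backend = user_selected.value_or(Backend::Jemalloc);
  switch (backend) {
    case Backend::System:
      return "system";
    case Backend::Jemalloc:
      return "jemalloc";
  }
  // Only reachable when the stored value is not a valid enumerator,
  // e.g. because the memory holding it was overwritten.
  return "Internal error: cannot create default memory pool";
}

// Simulates a corrupted selector by forcing an out-of-range value into the enum.
Backend CorruptedBackend() { return static_cast<Backend>(0x7f); }
```

Calling `ResolvePool(CorruptedBackend())` takes the error path, while a zeroed selector resolves to "system", which matches the observation that the backend appeared to change from "jemalloc" to "system" before the crash.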

On Wed., 24 Mar. 2021, 2:26 pm Micah Kornfield, <emkornfield@gmail.com>
wrote:

> What is the source of the record batch?  There was a patch since 3.0 that
> fixed some potential memory corruption when reading parquet in certain
> scenarios (but from the description it doesn't sound like libparquet is
> being used?)
>
> On Tue, Mar 23, 2021 at 8:04 PM Matt Youill <matt.youill@airmettle.com>
> wrote:
>
>> So this seems to be caused by the variable in memory_pool.cc:
>>
>> const util::optional<MemoryPoolBackend> user_selected_backend =
>> UserSelectedBackend();
>>
>> being (or becoming) garbage.
>>
>> For some reason, after a few Gandiva batch evaluations
>> user_selected_backend is no longer "jemalloc" but "system" (probably
>> actually just null because "system" is 0) and after a while it isn't valid
>> at all and crashes.
>>
>> There aren't multiple copies of Arrow AFAICT but I do have two apps using
>> arrow. Both use libarrow.a, libarrow-glib.a and libgandiva.a... one (that
>> I'm not super familiar with) shows the above behavior and the other doesn't.
>>
>> On 22/3/21 10:27 pm, Matt Youill wrote:
>>
>> Could be the build creating multiple Arrows I suppose. It's a mixture of
>> quite an old Makefile calling cmake to build arrow and arrow c lib.
>>
>> Will double check.
>>
>> Thanks, Matt
>>
>> On Mon., 22 Mar. 2021, 9:35 pm Antoine Pitrou, <antoine@python.org>
>> wrote:
>>
>>> On Mon, 22 Mar 2021 19:34:19 +1100
>>> Matt Youill <matt.youill@airmettle.com> wrote:
>>> > Hi,
>>> >
>>> > Not sure if anyone knows anything about this, but am getting a strange
>>> > error when evaluating a record batch with a gandiva filter...
>>> >
>>> > __GI_raise 0x00007f2b8f01718b
>>> > __GI_abort 0x00007f2b8eff6859
>>> > arrow::util::ArrowLog::~ArrowLog() 0x000056309fe94c12
>>> > arrow::default_memory_pool() 0x000056309fd6fff4
>>> > gandiva::Annotator::PrepareEvalBatch(arrow::RecordBatch const&,
>>> > std::vector<std::shared_ptr<arrow::ArrayData>,
>>> > std::allocator<std::shared_ptr<arrow::ArrayData> > > const&)
>>> > 0x000056309facdfce
>>> > gandiva::LLVMGenerator::Execute(arrow::RecordBatch const&,
>>> > std::vector<std::shared_ptr<arrow::ArrayData>,
>>> > std::allocator<std::shared_ptr<arrow::ArrayData> > > const&)
>>> > 0x000056309faa66a2
>>> > gandiva::Filter::Evaluate(arrow::RecordBatch const&,
>>> > std::shared_ptr<gandiva::SelectionVector>) 0x000056309fa9ea1d
>>> >
>>> >
>>> > The error reported is "Internal error: cannot create default memory pool"
>>> >
>>> > I'm using jemalloc
>>> >
>>> > Not even really sure how a call to arrow::default_memory_pool() can
>>> > fail? This is only occurring in a release build if that helps?
>>>
>>> This logically should not happen.  How did you compile Arrow and
>>> Gandiva?  Do you have two versions of Arrow lying around perhaps?
>>>
>>>
>>>
>>>
