arrow-user mailing list archives

From Micah Kornfield <emkornfi...@gmail.com>
Subject Re: Cannot create default memory pool
Date Wed, 24 Mar 2021 03:26:19 GMT
What is the source of the record batch?  A patch that landed after 3.0
fixed some potential memory corruption when reading Parquet in certain
scenarios (but from the description it doesn't sound like libparquet is
being used?)

On Tue, Mar 23, 2021 at 8:04 PM Matt Youill <matt.youill@airmettle.com>
wrote:

> So this seems to be caused by the variable in memory_pool.cc:
>
> const util::optional<MemoryPoolBackend> user_selected_backend =
> UserSelectedBackend();
>
> being (or becoming) garbage.
>
> For some reason, after a few Gandiva batch evaluations
> user_selected_backend is no longer "jemalloc" but "system" (probably
> actually just null because "system" is 0) and after a while it isn't valid
> at all and crashes.
>
> There aren't multiple copies of Arrow AFAICT but I do have two apps using
> arrow. Both use libarrow.a, libarrow-glib.a and libgandiva.a... one (that
> I'm not super familiar with) shows the above behavior and the other doesn't.
>
> On 22/3/21 10:27 pm, Matt Youill wrote:
>
> Could be the build creating multiple Arrows I suppose. It's a mixture of
> quite an old Makefile calling cmake to build arrow and arrow c lib.
>
> Will double check.
>
> Thanks, Matt
>
> On Mon., 22 Mar. 2021, 9:35 pm Antoine Pitrou, <antoine@python.org> wrote:
>
>> On Mon, 22 Mar 2021 19:34:19 +1100
>> Matt Youill <matt.youill@airmettle.com> wrote:
>> > Hi,
>> >
>> > Not sure if anyone knows anything about this, but I am getting a strange
>> > error when evaluating a record batch with a gandiva filter...
>> >
>> > __GI_raise 0x00007f2b8f01718b
>> > __GI_abort 0x00007f2b8eff6859
>> > arrow::util::ArrowLog::~ArrowLog() 0x000056309fe94c12
>> > arrow::default_memory_pool() 0x000056309fd6fff4
>> > gandiva::Annotator::PrepareEvalBatch(arrow::RecordBatch const&,
>> > std::vector<std::shared_ptr<arrow::ArrayData>,
>> > std::allocator<std::shared_ptr<arrow::ArrayData> > > const&)
>> > 0x000056309facdfce
>> > gandiva::LLVMGenerator::Execute(arrow::RecordBatch const&,
>> > std::vector<std::shared_ptr<arrow::ArrayData>,
>> > std::allocator<std::shared_ptr<arrow::ArrayData> > > const&)
>> > 0x000056309faa66a2
>> > gandiva::Filter::Evaluate(arrow::RecordBatch const&,
>> > std::shared_ptr<gandiva::SelectionVector>) 0x000056309fa9ea1d
>> >
>> >
>> > The error reported is "Internal error: cannot create default memory pool"
>> >
>> > I'm using jemalloc
>> >
>> > Not even really sure how a call to arrow::default_memory_pool() can
>> > fail? This is only occurring in a release build if that helps?
>>
>> This logically should not happen.  How did you compile Arrow and
>> Gandiva?  Do you have two versions of Arrow lying around perhaps?
>>
>>
>>
>>
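
For readers following the thread: the variable Matt quotes from
memory_pool.cc is a namespace-scope constant, so it is set up during static
initialization. Below is a minimal, self-contained C++17 sketch of the same
pattern written as a function-local static instead. It is an illustration
only, not Arrow's code and not a diagnosis of this crash. `Backend`,
`ParseBackend` and `BackendName` are hypothetical names, and
`UserSelectedBackend` here is a stand-in that only borrows the name. The one
detail taken from Arrow is the ARROW_DEFAULT_MEMORY_POOL environment variable.
A function-local static removes any dependence on initialization order across
translation units, but it would not protect the value from being overwritten
by memory corruption elsewhere in the process, which is the possibility raised
at the top of the thread.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical stand-in for Arrow's MemoryPoolBackend enum. The first
// enumerator has value 0, matching the observation that "system" is 0.
enum class Backend { System = 0, Jemalloc = 1 };

// Hypothetical parser: maps a backend name to an enumerator, or returns
// std::nullopt when the name is missing or not recognized.
std::optional<Backend> ParseBackend(const char* name) {
  if (name == nullptr) return std::nullopt;
  const std::string s(name);
  if (s == "system") return Backend::System;
  if (s == "jemalloc") return Backend::Jemalloc;
  return std::nullopt;
}

// Stand-in for the quoted variable, written as a function-local static.
// The value is computed on the first call (thread-safe since C++11) and
// every later call returns a reference to the same object.
const std::optional<Backend>& UserSelectedBackend() {
  static const std::optional<Backend> selected =
      ParseBackend(std::getenv("ARROW_DEFAULT_MEMORY_POOL"));
  return selected;
}

// Hypothetical helper for logging the current selection.
const char* BackendName(const std::optional<Backend>& backend) {
  if (!backend.has_value()) return "(unset)";
  switch (*backend) {
    case Backend::System: return "system";
    case Backend::Jemalloc: return "jemalloc";
  }
  // Reached only if the stored value is not a valid enumerator.
  return "(invalid)";
}
```

Logging `BackendName(UserSelectedBackend())` before and after each batch
evaluation is one way to check whether the selection stays stable over time.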
