References: <20210322113538.17442986@fsol> <219f4c59-34a9-a273-2bcc-e587440b2910@airmettle.com>
In-Reply-To: <219f4c59-34a9-a273-2bcc-e587440b2910@airmettle.com>
Reply-To: emkornfield@gmail.com
From: Micah Kornfield
Date: Tue, 23 Mar 2021 20:26:19 -0700
Subject: Re: Cannot create default memory pool
To: user@arrow.apache.org

What is the source of the record batch? There was a patch since 3.0 that
fixed some potential memory corruption when reading parquet in certain
scenarios (but from the description it doesn't sound like libparquet is
being used?)

On Tue, Mar 23, 2021 at 8:04 PM Matt Youill <matt.youill@airmettle.com> wrote:

> So this seems to be caused by the variable in memory_pool.cc:
>
> const util::optional<MemoryPoolBackend> user_selected_backend =
>     UserSelectedBackend();
>
> being (or becoming) garbage.
>
> For some reason, after a few Gandiva batch evaluations
> user_selected_backend is no longer "jemalloc" but "system" (probably
> actually just null because "system" is 0) and after a while it isn't valid
> at all and crashes.
>
> There aren't multiple copies of Arrow AFAICT but I do have two apps using
> arrow. Both use libarrow.a, libarrow-glib.a and libgandiva.a... one (that
> I'm not super familiar with) shows the above behavior and the other doesn't.
>
> On 22/3/21 10:27 pm, Matt Youill wrote:
>
> Could be the build creating multiple Arrows I suppose. It's a mixture of
> quite an old Makefile calling cmake to build arrow and arrow c lib.
>
> Will double check.
>
> Thanks, Matt
>
> On Mon., 22 Mar. 2021, 9:35 pm Antoine Pitrou, <antoine@python.org> wrote:
>
>> On Mon, 22 Mar 2021 19:34:19 +1100
>> Matt Youill <matt.youill@airmettle.com> wrote:
>> > Hi,
>> >
>> > Not sure if anyone knows anything about this, but am getting a strange
>> > error when evaluating a record batch with a gandiva filter...
>> >
>> > __GI_raise 0x00007f2b8f01718b
>> > __GI_abort 0x00007f2b8eff6859
>> > arrow::util::ArrowLog::~ArrowLog() 0x000056309fe94c12
>> > arrow::default_memory_pool() 0x000056309fd6fff4
>> > gandiva::Annotator::PrepareEvalBatch(arrow::RecordBatch const&,
>> > std::vector<std::shared_ptr<arrow::ArrayData>,
>> > std::allocator<std::shared_ptr<arrow::ArrayData> > > const&)
>> > 0x000056309facdfce
>> > gandiva::LLVMGenerator::Execute(arrow::RecordBatch const&,
>> > std::vector<std::shared_ptr<arrow::ArrayData>,
>> > std::allocator<std::shared_ptr<arrow::ArrayData> > > const&)
>> > 0x000056309faa66a2
>> > gandiva::Filter::Evaluate(arrow::RecordBatch const&,
>> > std::shared_ptr<gandiva::SelectionVector>) 0x000056309fa9ea1d
>> >
>> >
>> > The error reported is "Internal error: cannot create default memory pool"
>> >
>> > I'm using jemalloc
>> >
>> > Not even really sure how a call to arrow::default_memory_pool() can
>> > fail? This is only occurring in a release build if that helps?
>>
>> This logically should not happen. How did you compile Arrow and
>> Gandiva? Do you have two versions of Arrow lying around perhaps?
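
To illustrate the question raised in the quoted thread (how a call to arrow::default_memory_pool() could fail at all), here is a minimal, self-contained sketch. It is not Arrow's source code. The names Backend, PoolNameFor and PoolNameForRaw are hypothetical, and the sketch assumes that the pool is picked by a switch over a backend enum whose fallback branch reports the internal error. Under that assumption, a stored backend value that has been overwritten with bytes matching no known backend would land in the fallback branch.

```cpp
#include <string>

// Hypothetical stand-in for the backend enum discussed in the thread.
// System is given the value 0, matching the remark that "system" is 0.
enum class Backend : int { System = 0, Jemalloc = 1 };

// Returns the name of the pool selected for `backend`, or an empty string
// when the value matches no known backend. The empty string stands in for
// the branch that would log "cannot create default memory pool".
std::string PoolNameFor(Backend backend) {
  switch (backend) {
    case Backend::System:
      return "system";
    case Backend::Jemalloc:
      return "jemalloc";
    default:
      return "";
  }
}

// Builds a Backend from a raw integer, imitating what the selection code
// would see if the memory holding the stored value had been overwritten.
std::string PoolNameForRaw(int raw) {
  return PoolNameFor(static_cast<Backend>(raw));
}
```

In this sketch, PoolNameForRaw(0) selects "system", while a raw value such as 42 that names no backend produces the empty result. That would be consistent with the reported sequence of "jemalloc", then "system", then a crash, if something were overwriting the stored value, but the thread itself does not confirm the cause.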