References: <9219cb7c-b332-aac4-d884-d211a97423bd@gmx.com>
From: Wes McKinney
Date: Mon, 9 Nov 2020 20:10:05 -0600
Subject: Re: Arrow C++ API - memory management
To: user@arrow.apache.org

The memory should automatically be freed when the owning object / shared_ptr / unique_ptr is destroyed. On Linux we use a background jemalloc thread by default, so it may not be freed immediately, but it should not be held indefinitely.

In any case, if you can reproduce the issue consistently we'd be glad to take a look. Please open a Jira issue and provide as much information as you can to make it easy for us to reproduce.

On Mon, Nov 9, 2020 at 9:41 AM Maciej Skrzypkowski wrote:
>
> OK, thanks for the answer.
>
> mArrowTable is "std::shared_ptr<arrow::Table> mArrowTable", so it should be managed properly by the shared pointer. I've narrowed down the problem to code like this:
>
>     void LoadCSVData::ReadArrowTableFromCSV( const std::string & filePath )
>     {
>         auto tableReader = CreateTableReader( filePath );
>         //ReadArrowTableUsingReader( *tableReader );
>     }
>
>     std::shared_ptr<arrow::csv::TableReader> LoadCSVData::CreateTableReader( const std::string & filePath )
>     {
>         arrow::MemoryPool* pool = arrow::default_memory_pool();
>         auto tableReader = arrow::csv::TableReader::Make( pool, OpenCSVFile( filePath ),
>             *PrepareReadOptions(), *PrepareParseOptions(), *PrepareConvertOptions() );
>         if ( !tableReader.ok() )
>         {
>             throw BadParametersException( std::string( "CSV file reader error: " ) + tableReader.status().ToString() );
>         }
>         return *tableReader;
>     }
>
> Memory still keeps filling up when ReadArrowTableFromCSV is called many times. Is Arrow's memory pool freed when the TableReader is destroyed? Or should I free it explicitly?
>
>
> On 09.11.2020 15:01, Wes McKinney wrote:
> > We'd prefer to answer questions on the mailing list or Jira (if something looks like a bug).
> >
> > There isn't enough detail on the SO question to understand what other things might be going on, but you are never destroying this->mArrowTable which is holding on to allocated memory. If the memory use keeps going up through repeated calls to the CSV reader that sounds like a possible leak, so we would need to see more details, including about your platform.
> >
> > On Mon, Nov 9, 2020 at 2:33 AM Maciej Skrzypkowski wrote:
> > >
> > > Hi All!
> > >
> > > I don't understand memory management in the C++ Arrow API. I have some memory leaks while using it. I've created a Stack Overflow question, maybe someone would answer it:
> > > https://stackoverflow.com/questions/64742588/how-to-manage-memory-while-reading-csv-using-apache-arrow-c-api
> > >
> > > Thanks,
> > > Maciej Skrzypkowski
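
Editor's sketch of the ownership point made in this thread: memory is returned when the last shared_ptr owner is destroyed, and a long-lived member such as mArrowTable keeps it alive until it is reset or replaced. This uses only the standard library. TrackingPool, Buffer, Table, Loader, ReadTable and LiveBytesAfterTemporaryReads are hypothetical stand-ins invented for illustration, not Arrow classes, and the sketch does not model jemalloc's delayed release to the OS. In real Arrow code, the analogous counter would be arrow::default_memory_pool()->bytes_allocated().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a memory pool: it only counts live bytes.
class TrackingPool {
 public:
  std::uint8_t* Allocate(std::size_t size) {
    bytes_allocated_ += size;
    return new std::uint8_t[size];
  }
  void Free(std::uint8_t* data, std::size_t size) {
    assert(bytes_allocated_ >= size);
    delete[] data;
    bytes_allocated_ -= size;
  }
  std::size_t bytes_allocated() const { return bytes_allocated_; }

 private:
  std::size_t bytes_allocated_ = 0;
};

// Hypothetical buffer: gives its memory back to the pool in the destructor.
class Buffer {
 public:
  Buffer(TrackingPool* pool, std::size_t size)
      : pool_(pool), size_(size), data_(pool->Allocate(size)) {}
  ~Buffer() { pool_->Free(data_, size_); }
  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;

 private:
  TrackingPool* pool_;
  std::size_t size_;
  std::uint8_t* data_;
};

// Hypothetical table: shares ownership of its column buffers.
struct Table {
  std::vector<std::shared_ptr<Buffer>> columns;
};

// Builds a table with num_columns buffers of column_bytes each.
std::shared_ptr<Table> ReadTable(TrackingPool* pool, std::size_t num_columns,
                                 std::size_t column_bytes) {
  auto table = std::make_shared<Table>();
  for (std::size_t i = 0; i < num_columns; ++i) {
    table->columns.push_back(std::make_shared<Buffer>(pool, column_bytes));
  }
  return table;
}

// Reads repeatedly and drops each result at the end of the loop body;
// returns the number of bytes still live in the pool afterwards.
std::size_t LiveBytesAfterTemporaryReads(TrackingPool* pool, int calls) {
  for (int i = 0; i < calls; ++i) {
    auto table = ReadTable(pool, 4, 1024);
  }  // the last owner goes out of scope here, so the buffers are freed
  return pool->bytes_allocated();
}

// Plays the role of a class holding a long-lived mArrowTable member.
struct Loader {
  std::shared_ptr<Table> table;
};
```

With a fresh TrackingPool, LiveBytesAfterTemporaryReads returns 0 however many times it loops, whereas assigning the result of ReadTable to Loader::table keeps those bytes counted until table.reset() is called or the Loader is destroyed. If the real counter keeps growing across repeated calls even though nothing holds a reference, that would point to a leak worth reporting, as suggested above.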