Date: Tue, 4 May 2021 15:09:00 +0000 (UTC)
From: "Alessandro Molina (Jira)"
To: issues@arrow.apache.org
Reply-To: dev@arrow.apache.org
Subject: [jira] [Created] (ARROW-12650) [Python] Improve documentation regarding dealing with memory mapped files

Alessandro Molina created ARROW-12650:
-----------------------------------------

             Summary: [Python] Improve documentation regarding dealing with memory mapped files
                 Key: ARROW-12650
                 URL: https://issues.apache.org/jira/browse/ARROW-12650
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Alessandro Molina

While one of the Arrow promises is that it makes it easy to read/write data bigger than memory, it's not immediately obvious from the pyarrow documentation how to deal with memory mapped files.

We hint that you can open files as memory mapped ( [https://arrow.apache.org/docs/python/memory.html?highlight=memory_map#on-disk-and-memory-mapped-files] ) but then we don't explain how to read/write Arrow Arrays or Tables from there.

While most high level functions to read/write formats (pqt, feather, ...) have an easy to guess {{memory_map=True}} option, we don't have any example of how that is meant to work for the Arrow format itself. For example, how you can do that using {{RecordBatchFile*}}.

An addition to the memory mapping section with a more meaningful example that reads/writes actual Arrow data (instead of plain bytes) would probably be more helpful.