From: Chris Nuernberger
Date: Tue, 26 Jan 2021 11:06:52 -0700
Subject: Re: Question the nature of the "Zero Copy" advantages of Apache Arrow
To: user@arrow.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000008c9dbb05b9d1853e"

--0000000000008c9dbb05b9d1853e
Content-Type: text/plain; charset="UTF-8"

Or just mmap with MAP_SHARED.

On Tue, Jan 26, 2021 at 11:01 AM Daniel Nugent <nugend@gmail.com> wrote:

> Is there a problem with just using a RAM disk as the method for sharing
> the arrow buffers? It just seems easier and less finicky than a separate
> API to program against.
>
> It also makes storing the data permanently a lot more straightforward, I
> think.
>
> --
> -Dan Nugent
> On Jan 26, 2021, 12:47 -0500, Thomas Browne <thomas@crvm.io>, wrote:
>
> So one of the big advantages of Arrow is the common format in memory, on
> the wire, across languages.
>
> I get that this makes it very easy and fast to transfer data between
> nodes, and between languages, which will all share the in-memory format
> and therefore the (often expensive) serialisation step is removed.
>
> However, is it true that one of the core objectives of the project is
> also to allow shared memory objects across different languages on the
> same node? For example, a fast C-based ingest system constantly
> populates a pyarrow buffer, which can be read directly by any other
> application on that node, through pointer sharing?
>
> If this is a core objective, what is the canonical way for brokering the
> "pointers" to this data between languages? Is it the Plasma store? And
> if so, are there plans for Plasma to be implemented in other client
> languages?
>
> In short: is Plasma (or if not Plasma, the functionality it provides
> implemented some other way), a core objective of the project?
>
> Or instead is Flight supposed to be used between languages on the same
> node, and if so, does Flight provide true zero-copy (ie - the same
> buffer, not copying the buffer) if run between processes on the same node?
>
> Many thanks.
>
>
--0000000000008c9dbb05b9d1853e--
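
The MAP_SHARED suggestion in the reply above can be illustrated with a minimal sketch. It is not taken from the thread and does not use any Arrow API: it relies only on Python's standard `mmap` module on a Unix system, and the helper names `shared_mapping` and `demo` and the file name `buffer.bin` are hypothetical. Two mappings of one file stand in for two processes. In practice the file would presumably live on a RAM disk (for example `/dev/shm` on Linux, which is an assumption here) and hold Arrow-formatted data.

```python
import mmap
import os
import tempfile


def shared_mapping(path, size):
    """Map `size` bytes of `path` read/write with MAP_SHARED (Unix only)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Grow the backing file so the whole mapping is backed by real pages.
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        return mmap.mmap(
            fd,
            size,
            flags=mmap.MAP_SHARED,
            prot=mmap.PROT_READ | mmap.PROT_WRITE,
        )
    finally:
        # Python's mmap duplicates the descriptor, so ours can be closed.
        os.close(fd)


def demo():
    """Write through one mapping and read the same bytes through another."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "buffer.bin")  # hypothetical file name
        writer = shared_mapping(path, 4096)
        reader = shared_mapping(path, 4096)  # stands in for a second process
        try:
            payload = b"arrow-bytes"
            writer[: len(payload)] = payload
            # With MAP_SHARED both mappings are backed by the same pages,
            # so the reader sees the write without an explicit transfer step.
            return bytes(reader[: len(payload)])
        finally:
            writer.close()
            reader.close()


if __name__ == "__main__":
    print(demo())
```

A real reader would be a separate process opening the same path, and the two sides would still need some way to agree on the path and on when the data is complete, which is the brokering question raised in the original message.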