From user-return-150-archive-asf-public=cust-asf.ponee.io@arrow.apache.org Mon Jul 8 07:18:34 2019
Mailing-List: contact user-help@arrow.apache.org; run by ezmlm
Reply-To: user@arrow.apache.org
MIME-Version: 1.0
From: Sebastien Binet
Date: Mon, 8 Jul 2019 09:18:16 +0200
Subject: Re: Go / Python Sharing
To: user@arrow.apache.org
Content-Type: multipart/alternative; boundary="00000000000028f2dd058d264073"

--00000000000028f2dd058d264073
Content-Type: text/plain; charset="UTF-8"

As far as I know, Go does support IPC (as in the Arrow IPC format).

Another option, which has been discussed at some point, was to have a shared
memory allocator so that Arrow arrays could be shared between processes.

I haven't looked in detail at what implementing Plasma support for Go would
need on the Go side...

-s

sent from my droid

On Mon, Jul 8, 2019, 08:29 Miki Tebeka <miki@353solutions.com> wrote:

> Hi Clive,
>
>> I'd like to understand the high level design for a system where a Go
>> process can communicate an Arrow data structure to a python process on the
>> same CPU
>
> I see two options
> - Different processes with shared memory, probably using plasma
> - Same process. Then either Go uses the Python shared library or Python uses
>   Go compiled to a shared library (-buildmode=c-shared)
>
>> - and for the python process to zero-copy gain access to that data,
>> change it and inform the Go process. This is low latency so I don't want
>> to save to file.
>
> IIRC arrow is not built for mutation. You build an Array/Table once and
> then use it.
>
>> Would this need the use of Plasma as a zero-copy store for the data
>> between the two processes or do I need to use IPC? But with IPC you are
>> transferring the data which is not needed in this case as I understand it.
>> Any pointers to examples would be appreciated.
>
> See above about options. Note that currently the Go arrow implementation
> doesn't support IPC or plasma (though it's in the works).
>
> Yoni & I are working on another option which is using the C++ arrow
> library from Go. It does support plasma and since it uses the same
> underlying C++ library that Python does you'll be able to pass a pointer
> around without copying data. It's at very alpha-ish state but you're more
> than welcome to give it a try - https://github.com/353solutions/carrow
>
> Happy hacking,
> Miki

--00000000000028f2dd058d264073
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--00000000000028f2dd058d264073--