arrow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Corey Nolet <cjno...@apache.org>
Subject Re: Question about mutability
Date Wed, 24 Feb 2016 19:52:10 GMT
So far, how does the integration with the Spark project look? Do you
envision cached Spark partitions allocating Arrows? I could imagine this
would be absolutely huge for being able to ask questions of real-time data
sets across applications.

On Wed, Feb 24, 2016 at 2:49 PM, Zhe Zhang <zhz@apache.org> wrote:

> Thanks for the insights Jacques. Interesting to learn the thoughts on
> zero-copy sharing.
>
> mmap allows sharing address spaces via filesystem interface. That has some
> security concerns but with the immutability restrictions (as you clarified
> here), it sounds feasible. What other options do you have in mind?
>
> On Wed, Feb 24, 2016 at 11:30 AM Corey Nolet <cjnolet@gmail.com> wrote:
>
> > Agreed,
> >
> > I thought the whole purpose was to share the memory space (using possibly
> > unsafe operations like ByteBuffers) so that it could be directly shared
> > without copy. My interest in this is to have it enable fully in-memory
> > computation. Not just "processing" as in Spark, but as a fully in-memory
> > datastore that one application can expose and share with others (e.g. In
> > memory structure is constructed from a series of parquet files, somehow,
> > then Spark pulls it in, does some computations, exposes a data set,
> > etc...).
> >
> > If you are leaving the allocation of the memory to the applications and
> > underneath the memory is being allocated using direct bytebuffers, I
> can't
> > see exactly why the problem is fundamentally hard- especially if the
> > applications themselves are worried about exposing their own memory
> spaces.
> >
> >
> > On Wed, Feb 24, 2016 at 2:17 PM, Andrew Brust <
> > andrew.brust@bluebadgeinsights.com> wrote:
> >
> > > Hmm...that's not exactly how Jaques described things to me when he
> > briefed
> > > me on Arrow ahead of the announcement.
> > >
> > > -----Original Message-----
> > > From: Zhe Zhang [mailto:zhz@apache.org]
> > > Sent: Wednesday, February 24, 2016 2:08 PM
> > > To: dev@arrow.apache.org
> > > Subject: Re: Question about mutability
> > >
> > > I don't think one application/process's memory space will be made
> > > available to other applications/processes. It's fundamentally hard for
> > > processes to share their address spaces.
> > >
> > > IIUC, with Arrow, when application A shares data with application B,
> the
> > > data is still duplicated in the memory spaces of A and B. It's just
> that
> > > data serialization/deserialization are much faster with Arrow (compared
> > > with Protobuf).
> > >
> > > On Wed, Feb 24, 2016 at 10:40 AM Corey Nolet <cjnolet@gmail.com>
> wrote:
> > >
> > > > Forgive me if this question seems ill-informed. I just started
> looking
> > > > at Arrow yesterday. I looked around the github a tad.
> > > >
> > > > Are you expecting the memory space held by one application to be
> > > > mutable by that application and made available to all applications
> > > > trying to read the memory space?
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message