arrow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yiannis Gkoufas <johngou...@gmail.com>
Subject Re: Understanding "shared" memory implications
Date Wed, 16 Mar 2016 00:27:24 GMT
Hi Wes,

can you please clarify something I don't understand? The next versions of
arrow will include the shared memory control flow as well?
So then, what is needed for HBase (for instance) to be integrated is the
adapter to the arrow format?
If yes, then who will be responsible for keeping the data locality in the
regionservers? i.e it would make sense to keep in the RAM of the
regionserver data that are "close" to those stored in the data node.
Hope it makes sense.

Regards,
Yiannis



On 15 March 2016 at 23:02, Wes McKinney <wes@cloudera.com> wrote:

> My understanding is that you can use java.nio.MappedByteBuffer to work
> with memory-mapped files as one way to share memory pages between Java
> (and non-Java) processes without copying.
>
> I am hoping that we can reach a POC of zero-copy Arrow memory sharing
> Java-to-Java and Java-to-C++ in the near future. Indeed this will have
> huge implications once we get it working end to end (for example,
> receiving memory from a Java process in Python without a heavy ser-de
> step -- it's what we've always dreamed of) and with the metadata and
> shared memory control flow standardized.
>
> - Wes
>
> On Wed, Mar 9, 2016 at 9:25 PM, Corey J Nolet <cjnolet@gmail.com> wrote:
> > If I understand correctly, Arrow is using Netty underneath which is
> using Sun's Unsafe API in order to allocate direct byte buffers off heap.
> It is using Netty to communicate between "client" and "server", information
> about memory addresses for data that is being requested.
> >
> > I've never attempted to use the Unsafe API to access off heap memory
> that has been allocated in one JVM from another JVM but I'm assuming this
> must be the case in order to claim that the memory is being accessed
> directly without being copied, correct?
> >
> > The implication here is huge. If the memory is being directly shared
> across processes by them being allowed to directly reach into the direct
> byte buffers, that's true shared memory. Otherwise, if there's copies going
> on, it's less appealing.
> >
> >
> > Thanks.
> >
> > Sent from my iPad
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message