arrow-user mailing list archives

From Thomas Browne <tho...@crvm.io>
Subject Re: Plasma store implementation status across client libraries
Date Mon, 04 Jan 2021 16:45:01 GMT
Interesting and agreed. I guess this is a big advantage of the "on the 
wire" unserialised format: just read it in and it's already native. 
I'll possibly go this way.
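
To make the "read it in and it's already native" point concrete, here is a
rough sketch using only the Python standard library. It is not Arrow or
Plasma code: the file name and the open_view helper are made up for
illustration, and a flat array of doubles stands in for an Arrow buffer.
Two mappings of the same file play the role of two client processes.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical location; on Linux a file under /dev/shm would sit in shared memory.
path = os.path.join(tempfile.mkdtemp(), "column.bin")

values = [1.5, 2.5, 4.0]

# "Writer": lay the doubles out in native byte order with no extra framing.
with open(path, "wb") as f:
    f.write(struct.pack(f"={len(values)}d", *values))


def open_view(path):
    """Map the file and expose its bytes as doubles without copying (illustrative helper)."""
    f = open(path, "r+b")
    mapping = mmap.mmap(f.fileno(), 0)  # length 0 maps the whole file
    return f, mapping, memoryview(mapping).cast("d")


# Two independent mappings stand in for two client processes.
f1, m1, reader = open_view(path)
f2, m2, writer = open_view(path)

writer[1] = 99.0      # one "client" updates a value in place
seen = list(reader)   # the other reads the same memory, no deserialisation step
total = sum(seen)

# Release the views before closing the mappings they point into.
for view in (reader, writer):
    view.release()
for m in (m1, m2):
    m.close()
for f in (f1, f2):
    f.close()
```

What this leaves out is everything beyond raw shared memory: there is no
notification when the data changes and no object lifetime management.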

However, I also note the beginnings of more advanced functionality in the 
Plasma store, for example a notification API on buffer seal (i.e. when 
an object is sealed, all subscribed clients can be notified).

https://arrow.apache.org/docs/python/generated/pyarrow.plasma.PlasmaClient.html#pyarrow.plasma.PlasmaClient.subscribe

I'm assuming the Plasma store will add functionality over time, and if 
this is the case, having all client libraries implement it means I can 
almost have a redis-like column-store specialising in numerical 
computation (which would be awesome), and for which I don't need to 
write my own functionality for each client library.

A numerical in-memory database, if you will.

On 04/01/2021 15:55, Chris Nuernberger wrote:
> Julia, Python, and R all have some support for mmap operations.
>
> On Mon, Jan 4, 2021 at 8:55 AM Chris Nuernberger <chris@techascent.com 
> <mailto:chris@techascent.com>> wrote:
>
>     Could simply saving the arrow file in streaming mode to shared
>     memory and then mmap-ing the result in each language solve your
>     problem? Plasma seems to me to be a layer on top of basic mmap
>     operations; as long as you have shared memory and mmap then you
>     can have multiple processes talking to the same logical block of
>     memory.
>
>     On Mon, Jan 4, 2021 at 8:27 AM Thomas Browne <thomas@crvm.io
>     <mailto:thomas@crvm.io>> wrote:
>
>         I am hoping to use the Apache Arrow project for cross-language
>         numerical computation, and for that the shared-memory idea is
>         very powerful. Am I correct that the Plasma Store is the
>         enabling technology for this, especially for soft real-time
>         computation (ie not moving to parquet or any file-based sharing
>         system)?
>
>         Is that the case? And if so, then I'm wondering which client
>         libraries, other than Python (and I assume C[++]), implement
>         the Plasma Store. This table doesn't feature a row for Plasma:
>
>         https://arrow.apache.org/docs/status.html
>
>         and I can't seem to find any reference to the Plasma store in
>         the Julia, R, or Javascript libraries.
>
>         https://arrow.apache.org/docs/r/
>
>         https://arrow.apache.org/docs/js/
>
>         https://arrow.juliadata.org/stable/
>
>
>         Thank you,
>
>         Thomas
>
>
