I believe there are also some low-level Java bindings (not sure of their quality).

On Mon, Jan 4, 2021 at 10:15 AM Neal Richardson <neal.p.richardson@gmail.com> wrote:
I believe Plasma only has Python bindings. FWIW it has not seen active development in quite a while.

Neal

On Mon, Jan 4, 2021 at 8:58 AM Chris Nuernberger <chris@techascent.com> wrote:
Yes, that makes sense. I guess you also need something to broker shared-memory filenames/IDs. The database isn't in-memory, however, although I know what you mean. One huge advantage of mmap is that larger-than-memory storage can act like in-memory storage: the Plasma store can be roughly the size of your disk, and larger than your RAM, but your program wouldn't know any better unless it attempts to copy a column verbatim.
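
To illustrate the lazy-paging point, here is a minimal sketch using only Python's standard mmap module. The helper names (make_sparse_file, write_at, read_slice) are made up for this example and are not part of Arrow or Plasma. It assumes a filesystem where a truncated file is sparse, which is common but not guaranteed.

```python
import mmap
import os
import tempfile


def make_sparse_file(path, size):
    # Extend an empty file to `size` bytes of zeros. On most filesystems
    # this allocates no data blocks until something is written.
    with open(path, "wb") as f:
        f.truncate(size)


def write_at(path, offset, data):
    # Write `data` at a byte offset without touching the rest of the file.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def read_slice(path, offset, length):
    # Map the whole file, but only the pages backing the requested slice
    # need to be brought into memory by the OS.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
            return buf[offset:offset + length]
```

Reading a few bytes near the end of a large mapped file this way does not require reading the bytes before them, which is what lets the mapped store exceed RAM.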

Numerical larger-than-memory-but-in-memory Redis indeed; that is an interesting way to think of it.

On Mon, Jan 4, 2021 at 9:45 AM Thomas Browne <thomas@crvm.io> wrote:

Interesting and agreed. I guess this is a big advantage of the "on the wire" unserialised format: just read it in and it's already native. I'll possibly go this way.

However, I also note the beginnings of more advanced functionality in the Plasma store, for example a notification API on buffer seal (i.e. when something changes, all clients can be notified).

https://arrow.apache.org/docs/python/generated/pyarrow.plasma.PlasmaClient.html#pyarrow.plasma.PlasmaClient.subscribe

I'm assuming the Plasma store will add functionality over time. If so, having all client libraries implement it means I can almost have a Redis-like column store specialising in numerical computation (which would be awesome), without having to write my own functionality for each client library.

A numerical in-memory database, if you will.

On 04/01/2021 15:55, Chris Nuernberger wrote:
Julia, Python, and R all have some support for mmap operations.

On Mon, Jan 4, 2021 at 8:55 AM Chris Nuernberger <chris@techascent.com> wrote:
Could simply saving the Arrow file in streaming mode to shared memory and then mmap-ing the result in each language solve your problem? Plasma seems to me to be a layer on top of basic mmap operations; as long as you have shared memory and mmap, you can have multiple processes talking to the same logical block of memory.
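
A rough sketch of that idea, using only the Python standard library so it runs anywhere. It writes a raw column of doubles rather than a real Arrow stream; with pyarrow you would write an Arrow IPC stream to the same path instead and map that. The helper names (shared_dir, write_column, map_column) are invented for this example, and the /dev/shm location is an assumption that holds on most Linux systems only.

```python
import mmap
import os
import struct
import tempfile


def shared_dir():
    # Assumption: /dev/shm is a RAM-backed tmpfs (typical on Linux).
    # Elsewhere, fall back to the ordinary temp directory.
    return "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()


def write_column(path, values):
    # Producer side: dump float64 values in native byte order.
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}d", *values))


def map_column(path):
    # Consumer side: map the file read-only and view it as doubles.
    # The memoryview refers to the mapped pages; no copy is made.
    with open(path, "rb") as f:
        buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return buf, memoryview(buf).cast("d")
```

Any process that knows the path can call map_column and see the same bytes; call release() on the view before closing the mapping. What this does not give you is the brokering of names/ids or change notification, which is the part Plasma layers on top.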

On Mon, Jan 4, 2021 at 8:27 AM Thomas Browne <thomas@crvm.io> wrote:
I am hoping to use the Apache Arrow project for cross-language numerical
computation, and for that the shared-memory idea is very powerful. Am I
correct that the Plasma Store is the enabling technology for this,
especially for soft real-time computation (i.e. not moving to Parquet or
any file-based sharing system)?

Is that the case? And if so, then I'm wondering which client libraries,
other than Python (and I assume C[++]), implement the Plasma Store. This
table doesn't feature a row for Plasma:

https://arrow.apache.org/docs/status.html

and I can't seem to find any reference to the Plasma store in the Julia,
R, or JavaScript libraries.

https://arrow.apache.org/docs/r/

https://arrow.apache.org/docs/js/

https://arrow.juliadata.org/stable/


Thank you,

Thomas