arrow-user mailing list archives

From Thomas Browne <tho...@crvm.io>
Subject Question the nature of the "Zero Copy" advantages of Apache Arrow
Date Tue, 26 Jan 2021 17:47:13 GMT
So one of the big advantages of Arrow is the common format in memory, on 
the wire, across languages.

I get that this makes it very easy and fast to transfer data between
nodes and between languages: they all share the same in-memory format,
so the (often expensive) serialisation step is removed.

However, is it true that one of the core objectives of the project is 
also to allow shared memory objects across different languages on the 
same node? For example, a fast C-based ingest system constantly 
populates a pyarrow buffer, which can be read directly by any other 
application on that node, through pointer sharing?
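
As an illustration only of what "the same buffer, read directly" means
here, below is a minimal sketch that uses Python's standard-library
multiprocessing.shared_memory rather than any Arrow or Plasma API. The
names produce and consume are hypothetical, and the int64 column layout
is an assumption made for the example. The reader attaches to the block
by name and reads the values in place, without copying the bytes.

```python
from multiprocessing import shared_memory

ITEM_SIZE = 8  # bytes per int64 value (layout assumed for this sketch)


def produce(values):
    """Writer side: create a named shared-memory block and fill it with int64s."""
    shm = shared_memory.SharedMemory(create=True, size=ITEM_SIZE * len(values))
    # View the raw bytes as signed 64-bit integers; this is a view, not a copy.
    column = shm.buf[: ITEM_SIZE * len(values)].cast("q")
    for i, value in enumerate(values):
        column[i] = value
    column.release()  # views must be released before the block is closed
    return shm


def consume(name, length):
    """Reader side: attach to the existing block by name and sum it in place."""
    shm = shared_memory.SharedMemory(name=name)
    column = shm.buf[: ITEM_SIZE * length].cast("q")
    total = sum(column)  # reads directly from the shared pages
    column.release()
    shm.close()
    return total


if __name__ == "__main__":
    block = produce([1, 2, 3, 4])
    try:
        # In a real setup the name would be handed to another process;
        # that hand-off is the "pointer brokering" step.
        print(consume(block.name, 4))
    finally:
        block.close()
        block.unlink()
```

In this sketch the block name plays the role of the shared "pointer":
whatever passes that name (and the length) from writer to reader is
doing the brokering job asked about in the next paragraph.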

If this is a core objective, what is the canonical way of brokering the
"pointers" to this data between languages? Is it the Plasma store? And
if so, are there plans for Plasma clients to be implemented in other
languages?

In short: is Plasma (or, if not Plasma, the functionality it provides,
implemented some other way) a core objective of the project?

Or is Flight instead supposed to be used between languages on the same
node? If so, does Flight provide true zero-copy (i.e. the same buffer,
not a copy of it) when run between processes on the same node?

Many thanks.
