arrow-user mailing list archives

From Xiaozhen Liu
Subject Does Arrow Flight use memory-mapped files for IPC within the same host?
Date Wed, 22 Jul 2020 12:59:23 GMT
Hi everyone,

Lately, I’ve been experimenting with Arrow Flight. So far I think it is really great,
especially since I don’t plan to build my own IPC framework (as I mentioned earlier,
I’m trying to use Arrow to communicate between Java and Python processes). The data
transfer speed is also very satisfactory, although I haven’t tried very large datasets yet.
However, I’m wondering: when I use Arrow Flight for IPC within the same machine,
is there any kind of optimization? By optimization I mean: will Flight internally use something
like memory-mapped files to transfer data? Even though Flight is optimized for speed, if
it still transfers data over the wire it cannot be faster than a shared-memory file, right?
I know this may be a strange question, since Arrow Flight is an RPC framework and is probably
better suited for communication between different hosts. But because it provides an RPC
protocol that saves me the trouble of building my own IPC framework, I chose Flight
for IPC (currently still on the same host).
I know that KNIME Analytics Platform also uses Arrow for IPC, and that it uses temporary Arrow
files to transfer data. I could do the same within Flight by simply passing
the location of the temp files in the messages. But first I want to check whether this is
already implemented by Flight internally.
I’ve looked through the Flight source code and haven’t found anything that looks like what
I’m describing. Am I missing something, or is it the case that Flight doesn’t (and doesn’t
plan to) use files for IPC within the same host?

Thank you.

Xiaozhen Liu
