arrow-user mailing list archives

From Micah Kornfield <emkornfi...@gmail.com>
Subject Re: Apache Arrow Java
Date Mon, 04 Jan 2021 19:55:13 GMT
Two approaches might help:
1.  Use the JPype-based integration in pyarrow [1][2].
2.  Obtain the direct memory address from the ArrowBuf objects [3].
Gandiva [4] uses this approach to pass addresses to C++; the Python code
would potentially look similar.
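
To make the second approach concrete, here is a minimal Python sketch of
the consuming side. It is an illustration under stated assumptions, not the
pyarrow or Gandiva implementation: the producer buffer below is a stand-in
for memory owned by an ArrowBuf, the int32 layout is assumed, and the helper
name view_int32 is hypothetical. It only uses the standard ctypes module and
assumes the address is valid in the same process (for example, a JVM
embedded through JPype).

```python
import ctypes
import struct

# Stand-in for the Java side: 16 bytes holding four native-endian int32 values.
# In the real setup this memory would be owned by an ArrowBuf inside the JVM,
# and only its address and size would be handed to Python.
producer = ctypes.create_string_buffer(struct.pack("=4i", 10, 20, 30, 40), 16)
address = ctypes.addressof(producer)
size_bytes = ctypes.sizeof(producer)


def view_int32(address, size_bytes):
    """Hypothetical helper: view raw memory as an int32 array without copying."""
    count = size_bytes // ctypes.sizeof(ctypes.c_int32)
    return (ctypes.c_int32 * count).from_address(address)


view = view_int32(address, size_bytes)
values = list(view)

# The view aliases the producer's memory, so a write on the producer side
# is visible through the view without any copy.
struct.pack_into("=i", producer, 0, 99)
first_after_write = view[0]
```

The view does not own the memory, so the producer must keep the buffer
alive for as long as the view is in use. The same lifetime concern would
apply to a buffer whose address comes from the JVM.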



[1] https://github.com/apache/arrow/blob/master/python/pyarrow/jvm.py
[2] https://uwekorn.com/2019/11/17/fast-jdbc-access-in-python-using-pyarrow-jvm.html
[3] https://github.com/apache/arrow/blob/f7d47a37f0418a5e615702dd974d4231184b4c70/java/memory/memory-core/src/main/java/org/apache/arrow/memory/ArrowBuf.java#L231
[4] https://github.com/apache/arrow/blob/master/java/gandiva/src/main/java/org/apache/arrow/gandiva/evaluator/Filter.java#L139

A side note: as far as I know, the Arrow Java implementation doesn't
currently support memory-mapped files.

On Wed, Dec 30, 2020 at 7:08 AM Igor <igor@upgini.com> wrote:

> Hello Apache Arrow developers!
>
> We are using the Apache Arrow library in Java and Python: arrow-vector and
> arrow-memory-unsafe in Java, and pyarrow in Python.
>
> We are trying to implement an in-memory, zero-copy DataFrame, but we can’t
> find an appropriate API in the Java libraries to get the memory address of
> our vectors from Python. I have found such an API in pyarrow, but not in
> the Java libraries.
>
> What we need:
> 1) Create vector in java, collect data in memory using arrow as memory map
> API
> 2) Get memory address or descriptor in java
> 3) Pass it to the python library Pyarrow
> 4) Read vector data
>
> We have a problem with point 2.
>
> Please tell us how we can do that. Thank you!
>
>
> Best regards,
> Eshtyganov Igor
> https://www.upgini.com
>
