arrow-user mailing list archives

From Micah Kornfield <emkornfi...@gmail.com>
Subject Re: Java Parquet to Arrow Conversion
Date Tue, 18 Aug 2020 16:24:31 GMT
Hi Chris,
There is an open PR to support this through C++'s Dataset functionality
[1]. There was also a prior attempt that went stale, which I can't find at
the moment.

IIUC, the main component still missing before the PR can be merged is
integration to honor the JVM's "-XX:MaxDirectMemorySize" setting.

-Micah

[1] https://github.com/apache/arrow/pull/7030




On Tue, Aug 18, 2020 at 6:48 AM Chris Nuernberger <chris@techascent.com>
wrote:

> Hey,
>
> We were wondering what the best way to convert a Parquet file to an Arrow
> file would be via a Java pathway. I notice that the C++ layer appears to
> have this conversion.
>
> The best hint I have seen so far is this gist:
> https://gist.github.com/animeshtrivedi/76de64f9dab1453958e1d4f8eca1605f
>
> I also found this JNI pathway for ORC files:
> https://github.com/apache/arrow/tree/master/cpp/src/jni
>
> Another thought I had was to use JNA or JNR and bind to the C GLib
> pathway.
>
> Thanks for any help,
>
> Chris
>
