arrow-user mailing list archives

From Andrew Melo <andrew.m...@gmail.com>
Subject Re: Java dataframe library for arrow suggestions
Date Tue, 16 Mar 2021 23:43:39 GMT
I can't speak to how complete it is, but I looked earlier for
something similar and ran across
https://github.com/deeplearning4j/nd4j. It's probably not an exact
fit, but it does appear to be able to consume Arrow buffers and expose
them to Java.

Cheers
Andrew

On Tue, Mar 16, 2021 at 6:36 PM Wes McKinney <wesmckinn@gmail.com> wrote:
>
> This has been asked several times in the past, but I'm not aware of
> anything "dataframe-like" in Java that's built against Arrow (or
> otherwise) that fills the kind of need that pandas does. There was a
> Scala project some years ago, Saddle [1] (not Arrow-based), built
> initially by one of the early pandas developers, but I don't think it's
> still being actively developed. Building a higher-level Java API on
> top of the Arrow Java libraries would be incredibly useful to the
> community, I'm sure.
>
> [1]: https://github.com/saddle/saddle
>
> On Tue, Mar 16, 2021 at 5:06 PM Paul Whalen <pgwhalen@gmail.com> wrote:
> >
> > Hi,
> >
> > I've been using Arrow for some time now, mostly in the context of
> > Arrow Flight between Java and Python. While it's quite easy to
> > convert Arrow data in Python to a pandas dataframe and manipulate it,
> > I'm struggling to find an obvious analogue on the Java side.
> > VectorSchemaRoot is useful for loading/unloading/moving data, but
> > clumsy for doing higher level operations, especially
> > joins/aggregations/etc across "tables".
> >
> > In other words, if I wanted to load non-Arrow-formatted data from
> > somewhere into Java, manipulate it with a dataframe-like API, and
> > then send the result somewhere via Flight, what library would be the
> > best/simplest way to accomplish that? I see lots of progress in other
> > languages, but I'm wondering what would be recommended for Java.
> >
> > I'm currently looking at Spark SQL just in-application, but that
> > seems a touch heavyweight, and I'm not sure it would do exactly what
> > I've described (nor am I terribly familiar with Spark in the first
> > place).
> >
> > If the premise of this question is flawed, please feel free to correct me.
> >
> > Thanks!
> > Paul
