arrow-dev mailing list archives

From Wes McKinney <...@cloudera.com>
Subject Re: Arrow examples
Date Mon, 29 Feb 2016 22:46:30 GMT
Hi Dmitriy,

I created the following JIRA related to PySpark, which seems relevant:
https://issues.apache.org/jira/browse/SPARK-13534
I would be happy to collaborate with you on this. As I understand it,
the Spark developers are exploring an in-memory columnar layout for
Spark DataFrames/Datasets and Spark SQL, so any conversion code we
write right now may end up being temporary.
Hopefully the Spark columnar memory layout will end up being very
nearly the same as the official Arrow layout so that limited or no
conversion will be necessary.

Thanks
Wes

On Wed, Feb 24, 2016 at 12:38 PM, Dmitriy Morozov <int.256h@gmail.com> wrote:
> Hello everyone,
>
> I'm just starting with Arrow. I'd like to see how good Arrow is at caching
> when used in conjunction with Alluxio (Tachyon). The use case that I'm
> going to validate involves reading data from a Spark DataFrame, storing it
> in Tachyon in Arrow format, and then reading it back into a DataFrame. I
> checked the Arrow source code but couldn't find any examples or tests. Can
> anyone please guide me on where I should start looking in order to convert
> a DataFrame to an Arrow struct?
>
> Thanks!
> Dmitriy
