arrow-dev mailing list archives

From "Wes McKinney (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ARROW-649) Explore a Weld/Arrow converter
Date Fri, 17 Mar 2017 21:45:41 GMT

    [ https://issues.apache.org/jira/browse/ARROW-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930778#comment-15930778 ]

Wes McKinney commented on ARROW-649:
------------------------------------

From https://github.com/weld-project/weld/tree/master/python/grizzly, it appears that
Weld knows how to operate on contiguous C memory, but I'll have to dig deeper to understand
all the details. If that's the case, then building a bridge in C to pass contiguous memory
held in Arrow C++ arrays should not be complicated.
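
As a rough illustration of what such a bridge would hand over, the sketch below passes only a base
address and an element count, and the consumer views the same memory without copying. It uses
Python's standard library as a stand-in for the C layer; `scale_in_place` is a hypothetical consumer
kernel, and nothing here is Weld's or Arrow's actual API.

```python
import array
import ctypes

# Stand-in for a contiguous Arrow data buffer holding float64 values.
values = array.array("d", [1.0, 2.0, 3.0, 4.0])

# A bridge would pass just these two things: base address and element count.
address, length = values.buffer_info()

# Consumer side: view the same memory through the raw address, with no copy.
view = (ctypes.c_double * length).from_address(address)


def scale_in_place(buf, n, factor):
    """Hypothetical consumer kernel that works directly on the shared memory."""
    for i in range(n):
        buf[i] *= factor


scale_in_place(view, length, 10.0)

# The producer observes the change because both sides share one buffer.
total = sum(values)
```

The producer has to keep `values` alive for as long as the consumer holds the address, which is the
main ownership question any real bridge would need to settle.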

One practical concern is missing data: Weld may not yet be able to interact with Arrow's
validity bitmaps. We'll want to make sure that the Weld DSL has a primitive operator
(or a plan to implement one) that can propagate bitmaps through operations.
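
To make "bitmap propagation" concrete, here is a minimal sketch in plain Python. It assumes Arrow's
convention of an LSB-first bitmap in which a set bit marks a valid slot; for an elementwise binary
operation the output bitmap is then the bytewise AND of the input bitmaps. All function names are
hypothetical, and `None` is used only to display null slots.

```python
def pack_validity(flags):
    """Pack booleans into an LSB-first bitmap where a set bit means 'valid'."""
    bitmap = bytearray((len(flags) + 7) // 8)
    for i, valid in enumerate(flags):
        if valid:
            bitmap[i // 8] |= 1 << (i % 8)
    return bytes(bitmap)


def is_valid(bitmap, i):
    """Return True when slot i is marked valid in the bitmap."""
    return bool(bitmap[i // 8] & (1 << (i % 8)))


def propagate_validity(left, right):
    """An output slot is valid only if both input slots are valid."""
    return bytes(a & b for a, b in zip(left, right))


def add_with_nulls(x, x_valid, y, y_valid):
    """Elementwise add that carries the combined validity bitmap along."""
    out_valid = propagate_validity(x_valid, y_valid)
    out = [x[i] + y[i] if is_valid(out_valid, i) else None for i in range(len(x))]
    return out, out_valid


x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
x_valid = pack_validity([True, False, True, True])
y_valid = pack_validity([True, True, False, True])
result, result_valid = add_with_nulls(x, x_valid, y, y_valid)
```

Here slots 1 and 2 come out null because each is missing in one of the inputs.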

It looks like Weld does not support null data yet (see https://github.com/weld-project/weld/blob/master/python/grizzly/grizzly_impl.py#L285),
so the benchmarks presented aren't exactly an apples-to-apples comparison: handling missing
data in every pandas operation comes at a significant cost.

I'm also interested in enabling Weld to understand Arrow's string memory layout (offsets + data
buffers).
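
For reference, the offsets + data layout can be sketched in plain Python as follows. The sketch
assumes n strings are stored as n + 1 offsets into a single UTF-8 data buffer (Arrow stores the
offsets as 32-bit integers; a Python list stands in for that buffer here), and it leaves out the
validity bitmap. The helper names are hypothetical.

```python
def encode_strings(strings):
    """Build an offsets list (length n + 1) and one contiguous UTF-8 data buffer."""
    offsets = [0]
    data = bytearray()
    for s in strings:
        data += s.encode("utf-8")
        offsets.append(len(data))
    return offsets, bytes(data)


def get_string(offsets, data, i):
    """String i occupies the bytes between offsets[i] and offsets[i + 1]."""
    return data[offsets[i]:offsets[i + 1]].decode("utf-8")


offsets, data = encode_strings(["weld", "", "arrow"])
third = get_string(offsets, data, 2)
```

An empty string is simply two equal consecutive offsets, so no per-string allocation or terminator
is needed.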

> Explore a Weld/Arrow converter
> ------------------------------
>
>                 Key: ARROW-649
>                 URL: https://issues.apache.org/jira/browse/ARROW-649
>             Project: Apache Arrow
>          Issue Type: New Feature
>            Reporter: Jacques Nadeau
>
> [~matei] and the Stanford team have just open-sourced Weld. It would be interesting to
> evaluate how we could move Arrow data to Weld's internal representation.
> Weld is here: https://github.com/weld-project/weld



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
