arrow-dev mailing list archives

From "Philipp Moritz (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ARROW-4757) Nested chunked array support
Date Mon, 04 Mar 2019 08:20:00 GMT
Philipp Moritz created ARROW-4757:
-------------------------------------

             Summary: Nested chunked array support
                 Key: ARROW-4757
                 URL: https://issues.apache.org/jira/browse/ARROW-4757
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Philipp Moritz


Dear all,

I'm currently trying to lift the 2GB limit on Python serialization. For this, I implemented
a chunked union builder that splits the array into smaller arrays.
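
As a rough illustration of the chunking idea described above, here is a minimal sketch in plain Python. It is not the Arrow implementation: `ChunkedBuilder`, `build_chunks`, and `max_chunk_len` are hypothetical names, and the cap is an element count standing in for the real byte-size limit.

```python
from typing import Any, Iterable, List


class ChunkedBuilder:
    """Hypothetical builder that starts a new chunk once a size cap is reached."""

    def __init__(self, max_chunk_len: int) -> None:
        if max_chunk_len <= 0:
            raise ValueError("max_chunk_len must be positive")
        self.max_chunk_len = max_chunk_len
        self._finished: List[List[Any]] = []  # sealed chunks
        self._current: List[Any] = []         # chunk still being filled

    def append(self, value: Any) -> None:
        # Seal the current chunk before it would grow past the cap.
        if len(self._current) >= self.max_chunk_len:
            self._finished.append(self._current)
            self._current = []
        self._current.append(value)

    def finish(self) -> List[List[Any]]:
        # Return all chunks, including a non-empty partial one, and reset.
        chunks = self._finished + ([self._current] if self._current else [])
        self._finished, self._current = [], []
        return chunks


def build_chunks(values: Iterable[Any], max_chunk_len: int) -> List[List[Any]]:
    """Feed every value through the builder and return the resulting chunks."""
    builder = ChunkedBuilder(max_chunk_len)
    for value in values:
        builder.append(value)
    return builder.finish()
```

For example, `build_chunks(range(5), 2)` yields three chunks, the last of which holds a single element.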

However, some of the children of the union array can be ListArrays, which can themselves contain
UnionArrays, which can contain ListArrays, and so on. I'm at a bit of a loss as to how to handle
this. In principle, I'd like to chunk the children too. However, UnionArrays can currently only
have children of type Array, and there is no way to treat a chunked array (which is a vector of
Arrays) as an Array in order to store it as a child of a UnionArray. Any ideas on how best to
support this use case?
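
The type mismatch described above can be modelled with a toy sketch in plain Python. These classes are simplified stand-ins, not the Arrow API: `Array`, `ChunkedArray`, `UnionArray`, and `can_be_union_child` are hypothetical names that only mirror the relationship stated in the message (a chunked array is a vector of Arrays, not an Array).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Array:
    """Toy stand-in for a single contiguous array."""
    values: list


@dataclass
class ChunkedArray:
    """Toy stand-in for a chunked array: a vector of Arrays, not itself an Array."""
    chunks: List[Array]


class UnionArray(Array):
    """Toy union whose children must each be a single Array."""

    def __init__(self, children: Sequence[Array]) -> None:
        for child in children:
            # A ChunkedArray fails this check because it does not derive from Array.
            if not isinstance(child, Array):
                raise TypeError(
                    f"child must be an Array, got {type(child).__name__}"
                )
        super().__init__(values=list(children))


def can_be_union_child(obj: object) -> bool:
    """Report whether the toy UnionArray accepts obj as a child."""
    try:
        UnionArray([obj])
    except TypeError:
        return False
    return True
```

In this model a nested union is accepted as a child because it is itself an Array, while a chunked child is rejected, which is the gap the question is about.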

-- Philipp.



