arrow-dev mailing list archives

From "Bryan Cutler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ARROW-369) [Python] Add ability to convert multiple record batches at once to pandas
Date Mon, 28 Nov 2016 07:03:59 GMT

    [ https://issues.apache.org/jira/browse/ARROW-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15701155#comment-15701155 ]

Bryan Cutler commented on ARROW-369:
------------------------------------

PR: https://github.com/apache/arrow/pull/216

> [Python] Add ability to convert multiple record batches at once to pandas
> -------------------------------------------------------------------------
>
>                 Key: ARROW-369
>                 URL: https://issues.apache.org/jira/browse/ARROW-369
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Wes McKinney
>              Labels: newbie
>
> Instead of only being able to convert single record batches and tables that
consist only of single ColumnChunks, we should also support the construction of pandas DataFrames
from multiple RecordBatches. In the simplest approach, we would convert each batch to a pandas
DataFrame and then concatenate them all. A second (and preferred) implementation would
extend the C++ function {{ConvertColumnToPandas}} in {{python/src/pyarrow/adapters/pandas.*}}
to work on chunked columns.
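
The chunked-column idea in the description above can be sketched in plain Python. This is only an illustration under stated assumptions, not the pyarrow API or the implementation in the linked PR: a "record batch" is modelled as a dict mapping column names to lists, and the helper names (batches_to_chunked_columns, convert_chunked_column, batches_to_frame) are hypothetical. Rather than building one frame per batch and concatenating whole frames, each column's per-batch arrays are grouped into chunks and the column is converted once.

```python
from itertools import chain


def batches_to_chunked_columns(batches):
    """Group the per-batch arrays of each column into a list of chunks.

    Each batch is a hypothetical stand-in for a record batch:
    a dict mapping column name -> list of values.
    """
    if not batches:
        return {}
    names = list(batches[0])
    for batch in batches:
        if list(batch) != names:
            raise ValueError("all batches must share the same schema")
    return {name: [batch[name] for batch in batches] for name in names}


def convert_chunked_column(chunks):
    """Flatten one chunked column into a single contiguous list."""
    return list(chain.from_iterable(chunks))


def batches_to_frame(batches):
    """Convert every chunked column once, yielding a column-name -> values dict."""
    chunked = batches_to_chunked_columns(batches)
    return {name: convert_chunked_column(chunks) for name, chunks in chunked.items()}


batches = [
    {"id": [1, 2], "name": ["a", "b"]},
    {"id": [3], "name": ["c"]},
]
frame = batches_to_frame(batches)
```

Working column by column avoids materialising an intermediate frame for every batch, which is presumably why the description prefers the second approach over plain concatenation.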



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
