arrow-user mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: Pyarrow: best way to store scheme
Date Sun, 11 Aug 2019 15:36:18 GMT
Hi Igor -- you can use the Schema.serialize and pyarrow.read_schema methods:

https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_ipc.py#L640

For short-term storage you can also pickle schemas.

I didn't find read_schema in the API reference
(http://arrow.apache.org/docs/python/api/ipc.html) so I opened a
documentation JIRA

https://issues.apache.org/jira/browse/ARROW-6201

- Wes

On Fri, Aug 9, 2019 at 4:47 AM Игорь Ястребов <ig.yastrebov@gmail.com>
wrote:
>
> Hi everyone!
>
> Is there a recommended way to store schemata for Arrow tables on disk? I want to load
> them later to provide type information to the CSV reader (by constructing a dictionary, or
> directly if that gets implemented in the future). This is necessary to read multiple CSV
> files that come from the same source but may get wrongly inferred types due to a lack of
> data in a particular file (null fields, integer types instead of float types).
