arrow-user mailing list archives

From jonathan mercier <>
Subject Can I load from a parquet file only few columns ?
Date Fri, 12 Feb 2021 14:21:44 GMT
I have a Parquet file with 300,000 columns and 30,000 rows.
If I load such a file into a pandas DataFrame (with pyarrow), it takes
around 100 GB of RAM.

Since I perform a pairwise comparison between columns, I could load the
data N columns at a time.
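
To illustrate the batching idea, here is a minimal sketch using only the standard library. The helper names `column_blocks` and `block_pairs` and the block size are hypothetical, chosen for illustration. The point is that every pair of columns falls into exactly one pair of blocks, so at most 2N columns need to be in memory at once.

```python
from itertools import combinations_with_replacement


def column_blocks(names, block_size):
    """Split the column names into consecutive blocks of at most block_size."""
    return [names[i:i + block_size] for i in range(0, len(names), block_size)]


def block_pairs(names, block_size):
    """Yield every unordered pair of blocks, including each block with itself.

    Comparing all columns inside each yielded pair covers every pairwise
    column comparison exactly once at the block level.
    """
    blocks = column_blocks(names, block_size)
    yield from combinations_with_replacement(blocks, 2)


if __name__ == "__main__":
    # Hypothetical column names; a real file would supply its own.
    for left, right in block_pairs(["c0", "c1", "c2", "c3", "c4"], 2):
        print(left, right)
```

With B blocks this gives B * (B + 1) / 2 block pairs, each needing only the columns of its two blocks to be loaded.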

So is it possible to load only a few columns from a Parquet file by
their names? That would save some memory.
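
For concreteness, the kind of call in question might look like the sketch below. It assumes the `columns` argument of `pyarrow.parquet.read_table` and uses `pyarrow.parquet.read_schema` to list the available names. The helper names `list_columns` and `load_columns`, the file path, and the column names are hypothetical.

```python
def list_columns(path):
    """Return the column names stored in the Parquet file's schema."""
    # Imported inside the function so the module loads without pyarrow.
    import pyarrow.parquet as pq

    # read_schema reads only the file metadata, not the column data.
    return pq.read_schema(path).names


def load_columns(path, column_names):
    """Read only the named columns into a pandas DataFrame."""
    import pyarrow.parquet as pq

    # Only the requested columns are read from the file.
    table = pq.read_table(path, columns=list(column_names))
    return table.to_pandas()
```

A call such as `load_columns("data.parquet", ["col_a", "col_b"])` (placeholder path and names) would then return a DataFrame holding just those two columns.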


                Researcher computational biology
                PhD, Jonathan MERCIER
                Bioinformatics (LBI)
                2, rue Gaston
                91057 Evry Cedex
                Tel :(+33)1 60 87 83 44
