arrow-user mailing list archives

From jonathan mercier <jonathan.merc...@cnrgh.fr>
Subject Can I load from a parquet file only few columns ?
Date Fri, 12 Feb 2021 14:21:44 GMT
Dear all,
I have a Parquet file with 300,000 columns and 30,000 rows.
Loading this file into a pandas DataFrame (with pyarrow) takes
around 100 GB of RAM.

Since I perform pairwise comparisons between columns, I could load the
data N columns at a time.

So is it possible to load only a few columns from a Parquet file, selected
by name? That would save some memory.

Thanks


-- 
Researcher computational biology
PhD, Jonathan MERCIER

Bioinformatics (LBI)
2, rue Gaston Crémieux
91057 Evry Cedex

Tel: (+33)1 60 87 83 44
Email: jonathan.mercier@cnrgh.fr

