arrow-user mailing list archives

From Manoj Karthick
Subject [Rust] [Parquet] Combining Parquet files
Date Fri, 05 Feb 2021 20:28:32 GMT

I've been playing around with the Rust Parquet library and was trying to
understand how to combine Parquet files. I'm new to Rust and the Arrow
ecosystem, so I'd appreciate some help in figuring this out.

I'm looking for a way to naively merge Parquet files. For example, given
input files A, B, and C, I would like to create an output Parquet file
that has the row groups from A, B, and C placed one after the other (I
understand this might be inefficient, but this is mostly for development).

What would be the best way to achieve this? Also, how should the
FileMetaData be updated to reflect the new row groups and number of rows?
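
A rough sketch of the metadata bookkeeping such a merge implies, using
simplified stand-in types. `RowGroupMeta`, `FileMeta`, and `merge_metadata`
are hypothetical names for illustration and are not the Rust Parquet
library's API. The sketch only models concatenating row-group entries and
recomputing the total row count.

```rust
// Hypothetical stand-ins for illustration only; these are NOT the
// parquet crate's types.
#[derive(Debug, Clone, PartialEq)]
struct RowGroupMeta {
    num_rows: i64,
    total_byte_size: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    num_rows: i64,
    row_groups: Vec<RowGroupMeta>,
}

/// Concatenate the row groups of `inputs` in order and recompute the
/// file-level row count as the sum over all row groups.
///
/// A real merge would also need identical schemas across inputs and
/// updated byte offsets for the copied column chunks; both are omitted.
fn merge_metadata(inputs: &[FileMeta]) -> FileMeta {
    let row_groups: Vec<RowGroupMeta> = inputs
        .iter()
        .flat_map(|f| f.row_groups.iter().cloned())
        .collect();
    let num_rows = row_groups.iter().map(|rg| rg.num_rows).sum();
    FileMeta { num_rows, row_groups }
}
```

With inputs A, B, and C, calling `merge_metadata(&[a, b, c])` would yield
one `FileMeta` whose row groups appear in the order A, B, C.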

Thank you!
