arrow-user mailing list archives

From Fernando Herrera <fernando.j.herr...@gmail.com>
Subject Re: [Rust] [Parquet] Combining Parquet files
Date Sat, 06 Feb 2021 12:22:48 GMT
Hi,

Have a look at this guide:

https://elferherrera.github.io/arrow_guide/reading_parquet.html

It may give you an idea of how to approach what you want to do.

On Fri, Feb 5, 2021 at 8:29 PM Manoj Karthick <manojkarthick@ymail.com>
wrote:

> Hi,
>
> I've been playing around with the Rust Parquet library and was trying to
> understand how to combine Parquet files. I'm new to Rust and the Arrow
> ecosystem, so I'd appreciate some help in figuring this out.
>
> I'm looking for a way to naively merge Parquet files. For example, if we
> have input files: A, B, C - I would like to create an output Parquet file
> that has the row groups from A, B and C placed one after the other (I
> understand this might be inefficient, but this is mostly for development
> purposes).
>
> What would be the best way to achieve this? Also how should the
> FileMetaData be updated to reflect the new row groups and number of rows?
>
> Thank you!
>
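
On the FileMetaData part of the question, here is a rough sketch of the bookkeeping a naive merge implies. It is a toy model, not the parquet crate's API: `RowGroupMeta`, `FileMeta`, `merge_footers` and the `compressed_size` field are hypothetical names that only mirror the `num_rows` and `row_groups` fields of the Parquet footer. It assumes all inputs share the same schema and that row groups are copied byte for byte, one after the other, starting right after the 4-byte `PAR1` magic.

```rust
// Toy model of a Parquet footer (hypothetical types, not the parquet crate).
#[derive(Debug, Clone, PartialEq)]
struct RowGroupMeta {
    num_rows: i64,
    // Byte position of the row group inside the file that holds it.
    file_offset: i64,
    // Stand-in for the on-disk length of the row group in bytes.
    compressed_size: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    num_rows: i64,
    row_groups: Vec<RowGroupMeta>,
}

/// Concatenate the row groups of `inputs` in order. The merged footer lists
/// every row group, its `num_rows` is the sum over all row groups, and each
/// offset is recomputed for the row group's new position in the output.
fn merge_footers(inputs: &[FileMeta], data_start: i64) -> FileMeta {
    let mut merged = FileMeta { num_rows: 0, row_groups: Vec::new() };
    let mut next_offset = data_start;
    for input in inputs {
        for rg in &input.row_groups {
            merged.row_groups.push(RowGroupMeta {
                num_rows: rg.num_rows,
                file_offset: next_offset,
                compressed_size: rg.compressed_size,
            });
            merged.num_rows += rg.num_rows;
            next_offset += rg.compressed_size;
        }
    }
    merged
}

fn main() {
    let a = FileMeta {
        num_rows: 5,
        row_groups: vec![RowGroupMeta { num_rows: 5, file_offset: 4, compressed_size: 100 }],
    };
    let b = FileMeta {
        num_rows: 7,
        row_groups: vec![
            RowGroupMeta { num_rows: 3, file_offset: 4, compressed_size: 40 },
            RowGroupMeta { num_rows: 4, file_offset: 44, compressed_size: 60 },
        ],
    };
    // 4 = length of the leading "PAR1" magic bytes.
    let merged = merge_footers(&[a, b], 4);
    println!("rows = {}, row groups = {}", merged.num_rows, merged.row_groups.len());
}
```

In a real file the column chunk entries inside each row group also carry page offsets that would need the same shift, so reading the inputs as record batches and writing them back out through a single writer is probably the simpler route for development purposes.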
