arrow-user mailing list archives

From Manoj Karthick <manojkarth...@ymail.com>
Subject Re: [Rust] [Parquet] Combining Parquet files
Date Sat, 06 Feb 2021 20:10:35 GMT
Thank you. The guide was really helpful!

On Sat, Feb 6, 2021 at 4:23 AM Fernando Herrera <
fernando.j.herrera@gmail.com> wrote:

> Hi,
>
> Have a look at this
>
> https://elferherrera.github.io/arrow_guide/reading_parquet.html
>
> It may give you an idea of how to do what you want.
>
> On Fri, Feb 5, 2021 at 8:29 PM Manoj Karthick <manojkarthick@ymail.com>
> wrote:
>
>> Hi,
>>
>> I've been playing around with the Rust Parquet library and was trying to
>> understand how to combine Parquet files. I'm new to Rust and the Arrow
>> ecosystem, so I'd appreciate some help in figuring this out.
>>
>> I'm looking for a way to naively merge Parquet files. For example, if we
>> have input files: A, B, C - I would like to create an output Parquet file
>> that has the row groups from A, B and C placed one after the other (I
>> understand this might be inefficient, but this is mostly for development
>> purposes).
>>
>> What would be the best way to achieve this? Also, how should the
>> FileMetaData be updated to reflect the new row groups and number of rows?
>>
>> Thank you!
>>
>

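For reference, the metadata bookkeeping asked about in the original question can be sketched with a toy model. This is only an illustration under stated assumptions: RowGroup, FileMetaData, byte_len and merge_metadata below are hypothetical stand-ins written for this example, not the parquet crate's types or API, and no real Parquet bytes are read or written. The sketch shows the idea of appending the row groups of A, B and C in order, summing the row counts, and recomputing where each row group starts in the output file. In a real Parquet file the offsets are stored per column chunk, so an actual merge would have to shift those as well.

```rust
// Toy model only: hypothetical stand-ins, NOT the `parquet` crate's types.

#[derive(Debug, Clone, PartialEq)]
struct RowGroup {
    num_rows: i64,
    // Byte position where this row group starts in its file.
    file_offset: i64,
    // Number of bytes the row group occupies on disk (assumed field).
    byte_len: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct FileMetaData {
    num_rows: i64,
    row_groups: Vec<RowGroup>,
}

/// Append the row groups of every input in order, recomputing each
/// offset for the output file and summing the total row count.
fn merge_metadata(inputs: &[FileMetaData], data_start: i64) -> FileMetaData {
    let mut merged = FileMetaData { num_rows: 0, row_groups: Vec::new() };
    let mut next_offset = data_start;
    for input in inputs {
        for rg in &input.row_groups {
            merged.row_groups.push(RowGroup {
                num_rows: rg.num_rows,
                file_offset: next_offset,
                byte_len: rg.byte_len,
            });
            next_offset += rg.byte_len;
            merged.num_rows += rg.num_rows;
        }
    }
    merged
}

fn main() {
    // Made-up sizes for three single-row-group inputs A, B and C.
    let single = |rows: i64, len: i64| FileMetaData {
        num_rows: rows,
        row_groups: vec![RowGroup { num_rows: rows, file_offset: 4, byte_len: len }],
    };
    let (a, b, c) = (single(10, 100), single(5, 40), single(7, 60));

    // Data starts at byte 4, after the 4-byte "PAR1" magic.
    let merged = merge_metadata(&[a, b, c], 4);
    println!("{:#?}", merged);
}
```

The inputs are never modified; the merged metadata is built fresh, which keeps the offset arithmetic in one place and makes it easy to check.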