arrow-user mailing list archives

From Jack Chan <j4ck....@gmail.com>
Subject [Rust] [DataFusion] Reading remote parquet files in S3?
Date Fri, 12 Feb 2021 22:36:33 GMT
Hi. I'm interested in reading parquet files stored in S3. I would like to
be able to do the following:
1. read a single S3 file;
2. read all files in an S3 directory; and
3. read some files matching a pattern in an S3 directory.
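
To make the three cases concrete, here is a minimal sketch of how they could be expressed as a filter over object keys. It assumes the keys have already been listed by some S3 client (the listing call itself is omitted). `Selection`, `select_keys`, and `wildcard_match` are hypothetical names for illustration, not part of parquet.rs, DataFusion, or rusoto. Note that S3 has no real directories, so "directory" here means a key prefix such as `data/`.

```rust
/// The three ways of choosing objects described above (hypothetical type).
enum Selection<'a> {
    /// A single object, identified by its full key.
    Exact(&'a str),
    /// Every object whose key starts with the given prefix ("directory").
    Prefix(&'a str),
    /// Every object whose key matches a pattern in which `*` is a wildcard.
    Pattern(&'a str),
}

/// Returns true if `key` matches `pattern`. A `*` in the pattern matches any
/// run of characters (including `/`); all other characters match literally.
fn wildcard_match(pattern: &str, key: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let k: Vec<char> = key.chars().collect();
    let (mut pi, mut ki) = (0, 0);
    // Position of the most recent `*` in the pattern, and the key position
    // to resume from if the current attempt fails.
    let mut star: Option<usize> = None;
    let mut mark = 0;
    while ki < k.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ki;
            pi += 1;
        } else if pi < p.len() && p[pi] == k[ki] {
            pi += 1;
            ki += 1;
        } else if let Some(s) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = s + 1;
            mark += 1;
            ki = mark;
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Filters an already-listed set of keys according to `sel`.
fn select_keys<'a>(keys: &[&'a str], sel: &Selection<'_>) -> Vec<&'a str> {
    keys.iter()
        .copied()
        .filter(|k| match sel {
            Selection::Exact(name) => k == name,
            Selection::Prefix(prefix) => k.starts_with(*prefix),
            Selection::Pattern(pat) => wildcard_match(pat, k),
        })
        .collect()
}
```

For example, `select_keys(&keys, &Selection::Pattern("data/*.parquet"))` would keep only the keys under `data/` that end in `.parquet`.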

Currently, parquet.rs only supports local disk files. Potentially, this can
be done using the rusoto crate, which provides an S3 client. What would be a
good way to do this?
1. create a remote parquet reader (potentially duplicating a lot of code)
2. create an interface to abstract away reading from local/remote files
(not sure about performance if the reader blocks on every operation)
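
As an illustration of option 2, here is a minimal sketch of such an interface, assuming byte-range reads are the only primitive a reader needs. `RangeSource`, `LocalFile`, `InMemoryObject`, and `footer_metadata_len` are hypothetical names, not the existing parquet.rs API. `InMemoryObject` stands in for a remote object; a real S3 implementation would issue one ranged GET per `read_range` call, and wiring that up with rusoto is omitted here. The example relies on the Parquet file layout, in which a file ends with a 4-byte little-endian metadata length followed by the magic bytes `PAR1`.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::PathBuf;

/// Hypothetical abstraction over anything that supports byte-range reads.
trait RangeSource {
    /// Total size of the object in bytes.
    fn size(&self) -> io::Result<u64>;
    /// Reads exactly `length` bytes starting at `offset`.
    fn read_range(&self, offset: u64, length: usize) -> io::Result<Vec<u8>>;
}

/// Local disk implementation.
struct LocalFile {
    path: PathBuf,
}

impl RangeSource for LocalFile {
    fn size(&self) -> io::Result<u64> {
        Ok(std::fs::metadata(&self.path)?.len())
    }

    fn read_range(&self, offset: u64, length: usize) -> io::Result<Vec<u8>> {
        let mut file = File::open(&self.path)?;
        file.seek(SeekFrom::Start(offset))?;
        let mut buf = vec![0u8; length];
        file.read_exact(&mut buf)?;
        Ok(buf)
    }
}

/// Stand-in for a remote object (assumption: a real version would call S3).
struct InMemoryObject {
    bytes: Vec<u8>,
}

impl RangeSource for InMemoryObject {
    fn size(&self) -> io::Result<u64> {
        Ok(self.bytes.len() as u64)
    }

    fn read_range(&self, offset: u64, length: usize) -> io::Result<Vec<u8>> {
        let start = offset as usize;
        let end = start
            .checked_add(length)
            .filter(|end| *end <= self.bytes.len())
            .ok_or_else(|| {
                io::Error::new(io::ErrorKind::UnexpectedEof, "range past end of object")
            })?;
        Ok(self.bytes[start..end].to_vec())
    }
}

/// Reads the length of the Parquet footer metadata using a single range read
/// of the last 8 bytes: a 4-byte little-endian length, then the magic "PAR1".
fn footer_metadata_len(src: &dyn RangeSource) -> io::Result<u32> {
    let total = src.size()?;
    // 4-byte leading magic + 4-byte length + 4-byte trailing magic.
    if total < 12 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "file too small"));
    }
    let tail = src.read_range(total - 8, 8)?;
    if tail[4..] != b"PAR1"[..] {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "missing PAR1 magic"));
    }
    Ok(u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]))
}
```

With an interface like this, the performance concern comes down to how many `read_range` calls a reader makes: on local disk each call is cheap, while on S3 each one is a network round trip, so a remote-friendly reader would want to fetch the footer and whole column chunks in as few requests as possible.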

Jack
