arrow-user mailing list archives

From John E. Conlon
Subject [Java AvroToArrow] Creating Arrow Files from Avro
Date Tue, 29 Dec 2020 06:33:05 GMT
I am creating a data engineering pipeline that will transform binary Avro objects in S3 buckets
into Arrow objects and Parquet objects in S3.

I see that the Arrow Java libraries don't support Parquet at this time, so I plan to first use the
Arrow Java libraries for the Avro->Arrow transform and then use the Python Arrow library for the
Arrow->Parquet transform.

On the Java side, I plan to download my Avro objects to local files, create the Arrow files from
them, and then upload those.
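
The staging I have in mind looks roughly like the sketch below. It only shows the
download -> convert -> upload flow through temporary files. The Downloader, Converter and
Uploader interfaces and the StagedConversion class are placeholder names made up for
illustration; they are not part of Arrow or any S3 client library, and the Converter step is
exactly the Avro->Arrow part I am still missing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical staging skeleton: none of these names come from Arrow or an S3 SDK.
final class StagedConversion {

    /** Placeholder: fetch one S3 object into an existing local file (overwriting it). */
    interface Downloader { void download(String key, Path target) throws IOException; }

    /** Placeholder: the Avro->Arrow step, e.g. something built on AvroToArrow. */
    interface Converter { void convert(Path avroFile, Path arrowFile) throws IOException; }

    /** Placeholder: push the local result back to S3 under the given key. */
    interface Uploader { void upload(Path source, String key) throws IOException; }

    /** Runs download -> convert -> upload for one object and removes the temp files. */
    static String process(String avroKey, Downloader in, Converter conv, Uploader out)
            throws IOException {
        Path avroFile = Files.createTempFile("input-", ".avro");
        Path arrowFile = Files.createTempFile("output-", ".arrow");
        try {
            in.download(avroKey, avroFile);
            conv.convert(avroFile, arrowFile);
            // Derive the output key by swapping the ".avro" suffix for ".arrow".
            String arrowKey = avroKey.endsWith(".avro")
                    ? avroKey.substring(0, avroKey.length() - ".avro".length()) + ".arrow"
                    : avroKey + ".arrow";
            out.upload(arrowFile, arrowKey);
            return arrowKey;
        } finally {
            // Clean up local staging files whether or not the conversion succeeded.
            Files.deleteIfExists(avroFile);
            Files.deleteIfExists(arrowFile);
        }
    }
}
```

Keeping the three steps behind small interfaces would let me test the flow without S3 and plug
in the real Avro->Arrow conversion once I understand how to write it.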

I see AvroToArrow.avroToArrowIterator(schema, decoder, config) and the tests that use
AvroToArrow, but even though I have read the limited documentation, I am not sure how to go
about using it to read the Avro files and write an output Arrow file.

Can someone provide me with an example? 
