crunch-user mailing list archives

From Magnus Runesson
Subject Re: Reading Avro to GenericRecord
Date Mon, 27 Jan 2014 17:50:02 GMT
Thanks for the quick answer.

It is totally OK and reasonable to take one file in a directory and 
assume that all the others have the same schema.
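
For illustration, here is a minimal sketch of that "take one file and trust its schema" approach. It relies only on the Avro object container file layout (the magic bytes `Obj\x01`, followed by a metadata map whose `avro.schema` entry holds the writer schema as JSON). The class and method names (`AvroHeaderSchema`, `readSchemaJson`, `schemaOfNewestFile`) are hypothetical, the choice of the most recently modified `.avro` file is just one possible arbitrary rule, and a local directory stands in for the real input path, which in a Crunch job would more likely be on HDFS.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not part of Crunch or Avro.
final class AvroHeaderSchema {
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /** Returns the writer schema (as JSON) stored in an Avro container file header. */
    static String readSchemaJson(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[4];
        in.readFully(magic);
        if (!Arrays.equals(magic, MAGIC)) {
            throw new IOException("Not an Avro object container file");
        }
        String schema = null;
        // The metadata map is a series of blocks; a block count of 0 ends the map.
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {      // negative count: a byte size follows the count
                count = -count;
                readLong(in);     // skip the block's byte size
            }
            for (long i = 0; i < count; i++) {
                String key = new String(readBytes(in), StandardCharsets.UTF_8);
                byte[] value = readBytes(in);
                if ("avro.schema".equals(key)) {
                    schema = new String(value, StandardCharsets.UTF_8);
                }
            }
        }
        if (schema == null) {
            throw new IOException("Header has no avro.schema entry");
        }
        return schema;
    }

    /** Picks the most recently modified *.avro file in dir and reads its schema. */
    static String schemaOfNewestFile(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            Path newest = files
                .filter(p -> p.getFileName().toString().endsWith(".avro"))
                .max(Comparator.comparingLong((Path p) -> p.toFile().lastModified()))
                .orElseThrow(() -> new IOException("No .avro files in " + dir));
            try (InputStream in = Files.newInputStream(newest)) {
                return readSchemaJson(in);
            }
        }
    }

    // Avro longs are zig-zag encoded, variable-length integers.
    private static long readLong(DataInputStream in) throws IOException {
        long encoded = 0;
        int shift = 0;
        int b;
        do {
            b = in.readUnsignedByte();
            encoded |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0 && shift < 70);
        return (encoded >>> 1) ^ -(encoded & 1);
    }

    // Avro strings and bytes are a long length followed by that many bytes.
    private static byte[] readBytes(DataInputStream in) throws IOException {
        long len = readLong(in);
        if (len < 0 || len > Integer.MAX_VALUE) {
            throw new IOException("Bad length in header: " + len);
        }
        byte[] buf = new byte[(int) len];
        in.readFully(buf);
        return buf;
    }
}
```

With the Avro library on the classpath, the same information should be available from `DataFileReader.getSchema()`, and the resulting schema could then presumably be handed to Crunch via `Avros.generics(schema)`. The hand-rolled header parser above is only meant to show what "getting the schema from the file" involves.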

On 2014-01-27 18:27, Josh Wills wrote:
> No, I haven't written a way to do that yet, and I feel bad about it-- 
> a Clouderan asked me for just such a feature a couple of weeks ago and 
> it slipped my mind. I don't think it's hard to do, just a little 
> tedious and will require refreshing my memory of the Avro APIs. 
> There's also the potential issue that multiple Avro files in the same 
> input directory can have different schemas, so the one we would end up 
> reading might be somewhat arbitrary (e.g., based on the timestamp of 
> the files in the directory, or some such thing)-- is that ok?
> On Mon, Jan 27, 2014 at 9:12 AM, Magnus Runesson wrote:
>     Can I read an Avro file into a GenericRecord in (s)crunch without
>     providing the schema? I want crunch to get the schema from the
>     Avro file it reads. How do I do it?
>     /Magnus
