avro-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Avro schema and data read with it.
Date Wed, 17 Dec 2014 17:55:12 GMT
Avro skips over fields that were present in the writer's schema but
are no longer present in the reader's schema.  Skipping is
substantially faster than reading for most types.  For types whose
size is fixed or given by a length prefix, like string, bytes, fixed,
double and float, the file pointer can simply be advanced past skipped
values.  For skipped structures like records, maps and arrays, no
memory is allocated and no stores are made.  Avro data files are not
in a columnar format, however, so the i/o and decompression of skipped
fields are generally not avoided.
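
To make the difference between skipping and reading concrete, here is a
minimal Python sketch. It is not Avro's actual decoder. It only follows
the binary encoding described in the Avro specification (zigzag varint
longs, length-prefixed UTF-8 strings, 8-byte little-endian doubles).
The four-field writer schema and all function names are hypothetical,
chosen for illustration.

```python
import io
import struct

# Hypothetical writer schema: field order as written to the file.
WRITER_FIELDS = [("name", "string"), ("score", "double"),
                 ("note", "string"), ("id", "long")]

def write_long(buf, n):
    """Write a signed 64-bit integer as a zigzag varint."""
    n = (n << 1) ^ (n >> 63)
    while n & ~0x7F:
        buf.write(bytes([(n & 0x7F) | 0x80]))
        n >>= 7
    buf.write(bytes([n]))

def read_long(buf):
    """Read a zigzag varint and return the signed integer."""
    shift = 0
    result = 0
    while True:
        b = buf.read(1)[0]
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)

def write_string(buf, s):
    data = s.encode("utf-8")
    write_long(buf, len(data))  # length prefix, then the bytes
    buf.write(data)

def encode_record(name, score, note, id_):
    """Encode one record in writer-schema field order."""
    buf = io.BytesIO()
    write_string(buf, name)
    buf.write(struct.pack("<d", score))
    write_string(buf, note)
    write_long(buf, id_)
    return buf.getvalue()

def read_field(buf, typ):
    """Fully decode a value: allocates and returns a Python object."""
    if typ == "string":
        return buf.read(read_long(buf)).decode("utf-8")
    if typ == "double":
        return struct.unpack("<d", buf.read(8))[0]
    return read_long(buf)

def skip_field(buf, typ):
    """Advance past a value without decoding or storing it."""
    if typ == "string":
        buf.seek(read_long(buf), io.SEEK_CUR)  # read the prefix, jump the payload
    elif typ == "double":
        buf.seek(8, io.SEEK_CUR)               # fixed width, just jump
    else:
        read_long(buf)                         # varints must be scanned byte by byte

def read_projected(buf, wanted):
    """Walk the writer's fields, keeping only those the reader asked for."""
    record = {}
    for field, typ in WRITER_FIELDS:
        if field in wanted:
            record[field] = read_field(buf, typ)
        else:
            skip_field(buf, typ)
    return record
```

Calling read_projected(io.BytesIO(encode_record("alice", 3.5, "skip me", 42)),
{"name", "id"}) returns only the name and id fields. Note that the
whole encoded record still has to be in the buffer: the skipped bytes
are stepped over, not left unread on disk.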

Doug

On Wed, Dec 17, 2014 at 7:53 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com> wrote:
> I have data that is persisted in Avro format. Each record has a certain
> schema and contains 10 fields when it is persisted.
>
> When I read the same record(s) from another process, I also specify a schema
> with a subset of the fields (5).
>
> Will only 5 columns be read from disk?
> or
> Will all the columns be read but 5 are later discarded?
> or
> Are all the columns read but only five are accessible since the schema used
> to read contains only five columns?
>
> Please suggest.
>
> Regards,
> Deepak
>
