avro-user mailing list archives

From Patrick Linehan <pline...@plinehan.com>
Subject Re: schema resolution and record names
Date Mon, 04 Oct 2010 21:06:41 GMT
the "problem" i'm having is that i seem to be getting alias-like
functionality without using aliases.  i put "problem" in quotes because i
actually like the behavior, i just don't see how it jibes with the spec.
 maybe a code example is a better way to go about this.

i create a data file as follows:

Schema schemaA = ...
Schema schemaB = ...
GenericDatumWriter datumWriter = new GenericDatumWriter(schemaA);
DataFileWriter fileWriter = new DataFileWriter(datumWriter);
OutputStream out = new FileOutputStream("datafile.avro");
fileWriter.create(schemaA, out);

both schemaA and schemaB contain a single record definition, each with
exactly the same primitive-type fields; same types, same names, same order.
 however, the record names and namespaces differ.

using "avro-tools getschema", i can see that the schema stored in the file
is schemaA.  also, if i create a GenericDatumReader and read the file, the
returned GenericRecord values have a schema of schemaA.

however, i can also read the file using a SpecificDatumReader which is
initialized to the specific type corresponding to schemaB (let's call that
class RecordB), the schema which does _not_ match the schema of the file:

SpecificDatumReader<RecordB> datumReader = new SpecificDatumReader<RecordB>(RecordB.class);
DataFileReader<RecordB> fileReader = new DataFileReader<RecordB>(new File("datafile.avro"),
    datumReader);
RecordB record = fileReader.next();

examining the fields of "record" i see that the data has been parsed
correctly, as if RecordB's schema (the "reader's schema") was correctly
resolved with schemaA (the "writer's schema").
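
to make the mismatch concrete, here is a minimal sketch of the name check as i read it from the schema resolution and alias sections of the spec. it is a toy model, not avro's implementation: NamedSchema, namesResolve, and the com.example names are all hypothetical and stand in for the real Schema class and my real record names.

```java
import java.util.Collections;
import java.util.Set;

// Toy model of record-name matching during schema resolution.
// This does NOT use or reproduce Avro's own classes.
public class NameResolutionSketch {

    // Hypothetical stand-in for a record schema: a full name plus reader-side aliases.
    static final class NamedSchema {
        final String fullName;
        final Set<String> aliases;

        NamedSchema(String fullName, Set<String> aliases) {
            this.fullName = fullName;
            this.aliases = aliases;
        }
    }

    // Strict reading of the spec (an assumption on my part): two records match
    // only if their full names are equal, or the reader's schema lists the
    // writer's full name as an alias.
    static boolean namesResolve(NamedSchema writer, NamedSchema reader) {
        return writer.fullName.equals(reader.fullName)
                || reader.aliases.contains(writer.fullName);
    }

    public static void main(String[] args) {
        NamedSchema schemaA = new NamedSchema(
                "com.example.a.RecordA", Collections.<String>emptySet());
        NamedSchema schemaB = new NamedSchema(
                "com.example.b.RecordB", Collections.<String>emptySet());
        NamedSchema schemaBWithAlias = new NamedSchema(
                "com.example.b.RecordB", Collections.singleton("com.example.a.RecordA"));

        System.out.println(namesResolve(schemaA, schemaB));          // false
        System.out.println(namesResolve(schemaA, schemaBWithAlias)); // true
    }
}
```

under this reading, a reader schema with a different name and no alias should fail to resolve, which is the opposite of what i observe when reading the file.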

is this the expected behavior in this case?  does this not seem to
contradict the schema resolution portions of the spec?  is this behavior
specific to DataFileReader, since i "forced" the record type upon the

also, thanks for taking the time to reply.  i very much appreciate it.


On Mon, Oct 4, 2010 at 1:10 PM, Doug Cutting <cutting@apache.org> wrote:

> On 10/01/2010 05:45 PM, Patrick Linehan wrote:
>> am i misunderstanding the documentation?  is the behavior i'm seeing
>> expected?  when does a record name/namespace conflict actually cause an
>> error to be thrown?
> The alias feature in Avro 1.4 will let you read records whose name or
> namespace differ:
> http://avro.apache.org/docs/current/spec.html#Aliases
> Does that help?
> Doug
