avro-user mailing list archives

From Andrew Kenworthy <adwkenwor...@yahoo.com>
Subject Re: Does Avro GenericData.Record violate the .equals contract?
Date Fri, 10 Feb 2012 12:26:01 GMT
Hallo Doug,

Thank you for your feedback. I agree that implicitly using Order.IGNORE to ignore differences
between records makes sense, as that is the criterion used to define distinctness when sorting. But
it looks as though only the schema name is checked when deciding whether to examine each field
or not. The test below shows how this can break the symmetry of equals if one is not careful.
It is a "bad" example, since it is not a good idea to have two schemas with the same name and
namespace yet with different contents, but it shows how one might inadvertently make a wrong
assumption about equality:

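import java.util.ArrayList;
import java.util.List;
import org.apache.avro.Schema;
import org.apache.avro.Schema.Field;
import org.apache.avro.Schema.Field.Order;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.junit.Test;

@Test
public void test() {
    // schema1: the only field is marked Order.IGNORE
    Schema schema1 = Schema.createRecord("test_record", null, "my.namespace", false);
    List<Field> fields1 = new ArrayList<Field>();
    fields1.add(new Field("attribute1", Schema.create(Schema.Type.STRING), null, null, Order.IGNORE));
    schema1.setFields(fields1);

    // schema2: same name and namespace, but the field is Order.ASCENDING
    Schema schema2 = Schema.createRecord("test_record", null, "my.namespace", false);
    List<Field> fields2 = new ArrayList<Field>();
    fields2.add(new Field("attribute1", Schema.create(Schema.Type.STRING), null, null, Order.ASCENDING));
    schema2.setFields(fields2);

    // Two records with different values for attribute1
    GenericRecord record1 = new GenericData.Record(schema1);
    record1.put("attribute1", "1");
    GenericRecord record2 = new GenericData.Record(schema2);
    record2.put("attribute1", "2");

    System.out.println(record1.equals(record2)); // returns TRUE
    System.out.println(record2.equals(record1)); // returns FALSE
}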
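
The asymmetry can also be reproduced without Avro on the classpath. The sketch below is a
simplified stand-in, not Avro's implementation: MiniSchema, MiniRecord and demo() are
hypothetical names, and equals() simply assumes the behaviour described above (check that the
full names match, then compare fields using only the receiver's own field orders).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Simplified model of the behaviour discussed in this thread; NOT Avro's code.
class EqualsSymmetrySketch {

    enum Order { ASCENDING, IGNORE }

    // Hypothetical miniature schema: a full name plus an Order per field name.
    static final class MiniSchema {
        final String fullName;
        final Map<String, Order> fields = new LinkedHashMap<>();

        MiniSchema(String fullName) { this.fullName = fullName; }

        MiniSchema field(String name, Order order) {
            fields.put(name, order);
            return this;
        }
    }

    // Hypothetical miniature record whose equals() consults only its OWN schema.
    static final class MiniRecord {
        final MiniSchema schema;
        final Map<String, Object> values = new LinkedHashMap<>();

        MiniRecord(MiniSchema schema) { this.schema = schema; }

        void put(String name, Object value) { values.put(name, value); }

        @Override
        public boolean equals(Object o) {
            if (o == this) return true;
            if (!(o instanceof MiniRecord)) return false;
            MiniRecord that = (MiniRecord) o;
            // Assumption: only the schema name is checked, not the schema contents.
            if (!schema.fullName.equals(that.schema.fullName)) return false;
            // Field orders come from this record's schema alone; IGNORE fields are skipped.
            for (Map.Entry<String, Order> f : schema.fields.entrySet()) {
                if (f.getValue() == Order.IGNORE) continue;
                if (!Objects.equals(values.get(f.getKey()), that.values.get(f.getKey()))) {
                    return false;
                }
            }
            return true;
        }

        @Override
        public int hashCode() { return schema.fullName.hashCode(); }
    }

    // Mirrors the test above: same schema name, different field order, different values.
    static boolean[] demo() {
        MiniSchema schema1 = new MiniSchema("my.namespace.test_record")
                .field("attribute1", Order.IGNORE);
        MiniSchema schema2 = new MiniSchema("my.namespace.test_record")
                .field("attribute1", Order.ASCENDING);

        MiniRecord record1 = new MiniRecord(schema1);
        record1.put("attribute1", "1");
        MiniRecord record2 = new MiniRecord(schema2);
        record2.put("attribute1", "2");

        return new boolean[] { record1.equals(record2), record2.equals(record1) };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println(result[0]); // true: schema1 ignores attribute1
        System.out.println(result[1]); // false: schema2 compares attribute1
    }
}
```

In this model the two calls disagree because each record consults a different schema, so
comparing the schemas themselves (not just their names) before comparing fields would be one
way to restore symmetry.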

Andrew



>________________________________
> From: Doug Cutting <cutting@apache.org>
>To: user@avro.apache.org 
>Sent: Thursday, February 9, 2012 8:49 PM
>Subject: Re: Does Avro GenericData.Record violate the .equals contract?
> 
>On 02/09/2012 07:02 AM, Andrew Kenworthy wrote:
>> This means that if I have no sorting defined in my schema, that all
>> records are treated as being equal to one another.
>
>If you specify "order":"ignore" for all fields in a record, then, yes,
>all instances of that record would be equal.  I cannot imagine a case
>where this would be useful, but I also don't see how this would violate
>the equals() contract.
>
>The default for fields is to behave as if "order":"ascending" is
>specified.  Records are equal if all of their fields that are not
>specified as "order":"ignore" are equal.
>
>Doug
>
>
>