avro-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Can serialized Avro records be efficiently compared without deserializing?
Date Wed, 23 May 2012 17:28:35 GMT
On Tue, May 22, 2012 at 1:22 PM, Jonathan Coveney <jcoveney@gmail.com> wrote:
> Imagine I use Avro to serialize an object (without loss of generality let's
> say an array of longs). I'm curious if it is possible to compare those
> arrays without deserializing... ie look at the bytes in memory or on disk,
> and do the comparison based on those bytes (ie the raw comparison that
> Hadoop does in the shuffle sort).
>
> I poked around the documentation but wasn't sure where to look.

Yes, this is possible.

The Java method that does this is BinaryData#compare().

http://avro.apache.org/docs/current/api/java/org/apache/avro/io/BinaryData.html#compare(byte[], int, byte[], int, org.apache.avro.Schema)

Doug
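
To illustrate why a comparison on the serialized bytes can work for the array-of-longs case from the question, here is a minimal, dependency-free sketch. It is not Avro's implementation: the class RawLongArrayCompare and all of its methods are hypothetical names made up for this example. It assumes only the Avro binary encoding rules (longs as zig-zag variable-length integers, arrays as counted blocks ended by a zero count) and walks both buffers item by item, stopping at the first difference, without building any array or list. Note that a plain byte-wise comparison would not be enough, because the variable-length encoding does not sort numerically. With the real library you would instead call BinaryData.compare(b1, s1, b2, s2, schema), as in the link above.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; with Avro on the classpath, use BinaryData.compare().
public class RawLongArrayCompare {

    /** Position within a byte array, so decoding can advance in place. */
    static final class Cursor {
        final byte[] buf;
        int pos;

        Cursor(byte[] buf, int pos) { this.buf = buf; this.pos = pos; }

        /** Reads one zig-zag, variable-length long. */
        long readLong() {
            long raw = 0;
            int shift = 0;
            int b;
            do {
                b = buf[pos++] & 0xff;
                raw |= (long) (b & 0x7f) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            return (raw >>> 1) ^ -(raw & 1); // undo zig-zag
        }

        /** Reads the next block's item count; 0 marks the end of the array. */
        long readBlockCount() {
            long n = readLong();
            if (n < 0) {   // negative count: a block byte-size follows, skipped here
                readLong();
                n = -n;
            }
            return n;
        }
    }

    /** Compares two serialized long arrays starting at offsets s1 and s2. */
    static int compare(byte[] b1, int s1, byte[] b2, int s2) {
        Cursor c1 = new Cursor(b1, s1);
        Cursor c2 = new Cursor(b2, s2);
        long left1 = 0, left2 = 0; // items remaining in the current block
        while (true) {
            if (left1 == 0) left1 = c1.readBlockCount();
            if (left2 == 0) left2 = c2.readBlockCount();
            if (left1 == 0 || left2 == 0) {
                // An array that ends first sorts first; both ending means equal.
                return left1 == left2 ? 0 : (left1 == 0 ? -1 : 1);
            }
            int c = Long.compare(c1.readLong(), c2.readLong());
            if (c != 0) return c; // first differing item decides
            left1--;
            left2--;
        }
    }

    /** Test helper: writes one long in zig-zag, variable-length form. */
    static void writeLong(ByteArrayOutputStream out, long v) {
        long z = (v << 1) ^ (v >> 63);
        while ((z & ~0x7fL) != 0) {
            out.write((int) ((z & 0x7f) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    /** Test helper: serializes the items as a single block plus terminator. */
    static byte[] encode(long... items) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (items.length > 0) {
            writeLong(out, items.length);
            for (long item : items) writeLong(out, item);
        }
        writeLong(out, 0);
        return out.toByteArray();
    }
}
```

For example, compare(encode(1, 2, 3), 0, encode(1, 2, 4), 0) returns a negative value after decoding only as far as the third item. A raw comparator for the Hadoop shuffle would follow the same pattern, taking offsets into the shared buffers rather than copying the records out.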
