avro-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-108) add binary comparator
Date Fri, 28 Aug 2009 16:40:32 GMT

    [ https://issues.apache.org/jira/browse/AVRO-108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748893#action_12748893 ]

Doug Cutting commented on AVRO-108:

An API for this might be something like:

  BinaryComparator.compare(byte[] bytes1, int start1, byte[] bytes2, int start2, Schema schema);

The schema provided must be the schema used to write the data.

Records would be ordered by comparing their fields in order, arrays and maps by comparing
their entries, unions by their branches, etc.
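
To make the idea concrete, here is a rough, self-contained sketch of such a comparison for one narrow case: a flat record whose fields are longs and strings. It is not the proposed implementation. RawCompareSketch, FieldType, Cursor, writeLong and record are hypothetical names invented for this illustration, and a FieldType[] stands in for a real Schema. The only parts taken from Avro are the binary encoding rules it relies on: longs are written as zig-zag variable-length integers, and strings as a long byte count followed by UTF-8 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class RawCompareSketch {
    /** Hypothetical stand-in for a schema: the field types of a flat record, in order. */
    enum FieldType { LONG, STRING }

    /** Read position within one serialized buffer. */
    private static final class Cursor {
        final byte[] buf;
        int pos;

        Cursor(byte[] buf, int start) { this.buf = buf; this.pos = start; }

        /** Decodes one zig-zag variable-length long and advances past it. */
        long readLong() {
            long raw = 0;
            int shift = 0;
            int b;
            do {
                b = buf[pos++] & 0xff;
                raw |= (long) (b & 0x7f) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            return (raw >>> 1) ^ -(raw & 1);   // undo zig-zag
        }
    }

    /** Compares two serialized records field by field; no record objects are built. */
    static int compare(byte[] bytes1, int start1, byte[] bytes2, int start2, FieldType[] schema) {
        Cursor c1 = new Cursor(bytes1, start1);
        Cursor c2 = new Cursor(bytes2, start2);
        for (FieldType type : schema) {
            int result;
            if (type == FieldType.LONG) {
                result = Long.compare(c1.readLong(), c2.readLong());
            } else {
                // String: length prefix, then UTF-8 bytes compared as unsigned values.
                int len1 = (int) c1.readLong();
                int len2 = (int) c2.readLong();
                result = compareBytes(c1.buf, c1.pos, len1, c2.buf, c2.pos, len2);
                c1.pos += len1;
                c2.pos += len2;
            }
            if (result != 0) {
                return result;   // the first differing field decides the order
            }
        }
        return 0;
    }

    /** Lexicographic comparison of two byte ranges, treating bytes as unsigned. */
    static int compareBytes(byte[] a, int aStart, int aLen, byte[] b, int bStart, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int diff = (a[aStart + i] & 0xff) - (b[bStart + i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return aLen - bLen;
    }

    /** Test helper: writes a long as a zig-zag variable-length integer. */
    static void writeLong(ByteArrayOutputStream out, long value) {
        long raw = (value << 1) ^ (value >> 63);
        while ((raw & ~0x7fL) != 0) {
            out.write((int) ((raw & 0x7f) | 0x80));
            raw >>>= 7;
        }
        out.write((int) raw);
    }

    /** Test helper: serializes a hypothetical (long id, string name) record. */
    static byte[] record(long id, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, id);
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }
}
```

Note that the longs have to be decoded before they are compared: because of the zig-zag encoding, -5 is written as a larger byte value than 3, so a plain byte-wise comparison of the whole buffer would order them incorrectly. Nested records, arrays, maps and unions would need the same kind of schema-driven walk and are left out of this sketch.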

> add binary comparator
> ---------------------
>                 Key: AVRO-108
>                 URL: https://issues.apache.org/jira/browse/AVRO-108
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
> Hadoop MapReduce performance benefits greatly if data may be compared without deserializing
> to an object, but rather by examining its serialized bytes directly.  Such "raw" comparators
> are typically written by hand in Hadoop, and are very fragile.
> With Avro it is possible to generically compare two serialized byte sequences if their
> schema is known.  This should work for any Avro data, regardless of how it was serialized
> or how it will be deserialized.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
