avro-dev mailing list archives

From "Hong Tang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (AVRO-6) Better support for using customized in memory types with Avro GenericDatumReader and GenericDatumWriter
Date Sun, 12 Apr 2009 07:39:14 GMT

     [ https://issues.apache.org/jira/browse/AVRO-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Tang updated AVRO-6:
-------------------------

        Fix Version/s: 1.0
    Affects Version/s: 1.0
               Status: Patch Available  (was: Open)

The patch includes the following changes:
* Changed Schema.getFields() to return a map of <String, Field> where Field contains
a position and the field Schema.
* Changed the "protected" interface of GenericDatumReader and GenericDatumWriter so that both
classes can be easily sub-classed to support generic types that are not derived from GenericRecord
or GenericArray. Specifically,
** Added a set of isXXX() methods to test whether an object is of a specific type. All but
the following types can be customized: Integer, Long, Float, Double, Boolean.
** Added a getType() method that by default calls the isXXX() methods; it is provided in case
a sub-class can reimplement it more efficiently.
** instanceOf() is implemented on top of getType().
** Added access methods for RECORD type (newRecord(), getField(), addField(), removeField()),
ARRAY type (newArray(), addToArray(), peekArray(), getArrayElements(), getArraySize()), and
MAP type (newMap(), addToMap(), getMapEntrySet(), getMapSize()).
* Other minor changes:
** Made the object reuse code more consistent. E.g. newRecord(), newArray(), and newMap() all
take the old object as the first parameter.
** Parameterized GenericDatumReader and GenericDatumWriter for cases where a sub-class always
takes a specific type as the root type of all objects (and thus behaves as DatumReader<T>
instead of DatumReader<Object>).
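
To illustrate the sub-classing pattern described in the list above, here is a minimal sketch. It is not the code in avro-6.patch and does not use Avro's classes: SketchType, SketchDatumWriter, PositionalTuple, TupleDatumWriter, rootType(), and every method signature shown are hypothetical names chosen only to show how overridable isXXX() tests, a default getType(), and a type parameter could fit together.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the schema type tag; not Avro's own enum.
enum SketchType { RECORD, ARRAY, STRING, INT }

// Hypothetical base class that mirrors the shape of the hooks listed above.
class SketchDatumWriter<T> {
    // isXXX() tests that a sub-class may override for its own in-memory types.
    protected boolean isRecord(Object datum) { return datum instanceof Map; }
    protected boolean isArray(Object datum) { return datum instanceof List; }
    protected boolean isString(Object datum) { return datum instanceof CharSequence; }

    // By default getType() tries the isXXX() tests in turn; a sub-class
    // could replace this with a cheaper dispatch.
    protected SketchType getType(Object datum) {
        if (isRecord(datum)) return SketchType.RECORD;
        if (isArray(datum)) return SketchType.ARRAY;
        if (isString(datum)) return SketchType.STRING;
        if (datum instanceof Integer) return SketchType.INT; // fixed, not customizable
        throw new IllegalArgumentException("Unsupported datum: " + datum);
    }

    // instanceOf() is layered on getType().
    protected boolean instanceOf(SketchType expected, Object datum) {
        return getType(datum) == expected;
    }

    // Default record access goes through the field name.
    protected Object getField(Object record, String name, int position) {
        return ((Map<?, ?>) record).get(name);
    }

    // The type parameter fixes the root type accepted by this writer.
    public SketchType rootType(T root) { return getType(root); }
}

// Hypothetical positional record type, loosely in the spirit of a Pig tuple.
final class PositionalTuple {
    final Object[] values;
    PositionalTuple(Object... values) { this.values = values; }
}

// Sub-class that plugs the custom record type in by overriding two hooks.
class TupleDatumWriter extends SketchDatumWriter<PositionalTuple> {
    @Override protected boolean isRecord(Object datum) { return datum instanceof PositionalTuple; }
    @Override protected Object getField(Object record, String name, int position) {
        return ((PositionalTuple) record).values[position];
    }
}
```

In this sketch the sub-class never touches getType() or instanceOf(); overriding isRecord() and getField() is enough for the custom type to be treated as a RECORD and read by position.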

This patch only refactors the interfaces of RecordSchema, GenericDatumReader and GenericDatumWriter;
it does not fix any bugs or add new features. No additional unit tests are included.
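
As a rough picture of the Schema.getFields() change, the sketch below shows a field map keyed by name whose values carry a position, so that a positional record can be read by index after a single name lookup. FieldSketch, RecordSchemaSketch, PositionLookupDemo, and the use of a String as a placeholder for the field Schema are all assumptions made for illustration; the actual types in the patch may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for Field: a position plus the field's schema.
final class FieldSketch {
    final int position;
    final String schema; // placeholder for the field Schema
    FieldSketch(int position, String schema) {
        this.position = position;
        this.schema = schema;
    }
}

// Hypothetical record schema whose getFields() maps field name to field.
final class RecordSchemaSketch {
    private final Map<String, FieldSketch> fields = new LinkedHashMap<>();

    // Positions are assigned in declaration order (names assumed unique).
    RecordSchemaSketch addField(String name, String schema) {
        fields.put(name, new FieldSketch(fields.size(), schema));
        return this;
    }

    Map<String, FieldSketch> getFields() { return fields; }
}

final class PositionLookupDemo {
    // Resolve the name once, then read positional storage by index.
    static Object read(RecordSchemaSketch schema, Object[] row, String name) {
        FieldSketch field = schema.getFields().get(name);
        if (field == null) throw new IllegalArgumentException("No such field: " + name);
        return row[field.position];
    }
}
```

A caller that processes many records with the same schema could cache the position and skip the map lookup entirely, which is the efficiency argument for positional access.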

> Better support for using customized in memory types with Avro GenericDatumReader and GenericDatumWriter
> -------------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-6
>                 URL: https://issues.apache.org/jira/browse/AVRO-6
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.0
>            Reporter: Hong Tang
>             Fix For: 1.0
>
>         Attachments: avro-6.patch
>
>
> Currently Avro's GenericDatumReader/Writer requires that Record, Array, and Map be subclasses
> of GenericRecord, GenericArray, and Map. Additionally, STRING and BYTES are mapped to Utf8
> and ByteBuffer. Finally, Record fields are accessed through field names, which may be less
> efficient if a user-defined record class supports field access by position (such as PIG Tuples).
> I suggest we improve the interface to (1) allow more flexibility in using user-defined types with
> Avro; (2) support access to RECORDs by either field name or field position.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

