avro-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-695) Cycle Reference Support
Date Thu, 06 Jan 2011 19:42:46 GMT

    [ https://issues.apache.org/jira/browse/AVRO-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978470#action_12978470 ]

Doug Cutting commented on AVRO-695:

The current patch is a specification change, since it adds a new schema type, "cycle".  It
is back-compatible, but not forward-compatible: implementations that do not implement cycles
would be unable to read data that contains cycle schemas.

Instead, a schema for a cycle reference might be defined.  For example, one could define an
org.apache.avro.cycles.CycleReference record containing a single integer field.  CycleReference
would only be used in unions with types that it may refer to.  The DatumWriter would keep
an IdentityHashMap<Object,Integer> of records, maps and arrays written, adding an entry
the first time an instance is seen, and writing a CycleReference for subsequent occurrences
in appropriate unions.  The DatumReader would then keep an array of all records, maps and
arrays that have been read and, when it reads a CycleReference in a union, return a pointer
to the indicated element of that array.
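
A minimal sketch of that bookkeeping follows, written against plain Java collections
rather than the Avro API.  Everything in it is an assumption made for illustration: the
Node class stands in for a record such as Message, the string tokens "NODE:", "REF:" and
"NULL" stand in for the branches of a union that includes CycleReference, and the class
and method names are hypothetical.  It only shows the identity-map idea, not how a real
DatumWriter or DatumReader would be structured.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class CycleSketch {
    /** Hypothetical stand-in for a record with one self-typed field (like Message.previous). */
    static final class Node {
        final String name;
        Node previous;
        Node(String name) { this.name = name; }
    }

    /** Writer side: emit each instance once, then a numeric reference on repeat visits. */
    static List<String> write(Node root) {
        // Identity-based map: two distinct but equal objects must not be merged.
        Map<Object, Integer> seen = new IdentityHashMap<>();
        List<String> out = new ArrayList<>();
        Node cur = root;
        while (cur != null) {
            Integer id = seen.get(cur);
            if (id != null) {
                // Already written: emit the stand-in for a CycleReference and stop.
                out.add("REF:" + id);
                return out;
            }
            seen.put(cur, seen.size());   // index = order of first appearance
            out.add("NODE:" + cur.name);
            cur = cur.previous;
        }
        out.add("NULL");
        return out;
    }

    /** Reader side: remember every instance read, and resolve references by index. */
    static Node read(List<String> tokens) {
        List<Node> readSoFar = new ArrayList<>();
        Node root = null;
        Node last = null;
        for (String t : tokens) {
            Node next;
            if (t.startsWith("NODE:")) {
                next = new Node(t.substring(5));
                readSoFar.add(next);
            } else if (t.startsWith("REF:")) {
                next = readSoFar.get(Integer.parseInt(t.substring(4)));
            } else {
                next = null;              // "NULL" token
            }
            if (last == null) {
                root = next;
            } else {
                last.previous = next;
            }
            last = next;
            if (!t.startsWith("NODE:")) {
                break;                    // a reference or null ends the chain
            }
        }
        return root;
    }

    public static void main(String[] args) {
        Node message = new Node("message");
        Node message2 = new Node("message2");
        message.previous = message2;
        message2.previous = message2;     // self-reference, as in the issue's example
        System.out.println(write(message));  // prints [NODE:message, NODE:message2, REF:1]
    }
}
```

Because the reader appends instances in the same order the writer first saw them, the
integer written on one side indexes the right element on the other without any extra
data in the stream.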

Adding cycles should not slow applications that do not require this feature.  They could be
implemented in newly defined CycleDatumReader/Writer, or perhaps GenericDatumReader and GenericDatumWriter
could be modified to optionally handle such cycles.

> Cycle Reference Support
> -----------------------
>                 Key: AVRO-695
>                 URL: https://issues.apache.org/jira/browse/AVRO-695
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>    Affects Versions: 1.4.1
>            Reporter: Moustapha Cherri
>             Fix For: 1.5.0
>         Attachments: avro-1.4.1-cycle.patch.gz, avro-1.4.1-cycle.patch.gz
>   Original Estimate: 672h
>  Remaining Estimate: 672h
> This is a proposed implementation to add cycle reference support to Avro. It basically
> introduces a new type named Cycle. A Cycle contains a string representing the path to the other
> For example, suppose we have an object of type Message that has a member named previous,
> also of type Message, and we have this hierarchy:
> message
>   previous : message2
> message2
>   previous : message2
> When serializing, the cycle path for "message2.previous" will be "previous".
> The implementation depends on ANTLR to evaluate those cycles at read time to resolve them.
> I used ANTLR 3.2. This dependency is not mandated; I just used ANTLR to speed things up. I
> kept in this implementation the generated code from ANTLR, though this should not be the case
> as it should be generated during the build. I only updated the Java code.
> I did not do full unit testing, but you can find the "avrotest.Main" class, which can be used
> as a preliminary test.
> Please do not hesitate to contact me for further clarification if this seems interesting.
> Best regards,
> Moustapha Cherri

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
