avro-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-695) Cycle Reference Support
Date Mon, 09 Mar 2015 22:26:39 GMT

    [ https://issues.apache.org/jira/browse/AVRO-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353768#comment-14353768 ]

ASF GitHub Bot commented on AVRO-695:
-------------------------------------

GitHub user sachingsachin opened a pull request:

    https://github.com/apache/avro/pull/23

    AVRO-695: Support for circular references.

    During serialization, every object is put into a temporary thread-local hash-map whose
    key is the object and whose value is an integer ID.
    If an object is seen again while serializing, its ID is taken from the hash-map and
    wrapped in a 'CircularRef' instance, and the CircularRef wrapper is serialized instead
    of the object.
    
    On deserializing, if a CircularRef is encountered, we know that it holds the ID of a
    previously seen object, and so we restore a reference to that object.
    
    On the schema side, we create unions of all classes with CircularRef if the user suspects
    circular references in their code. This union makes sure the above writers are able to
    write a CircularRef instead of the actual object.
    
    Note that this strategy is perfectly safe when circularly referenced data is
    deserialized in other languages.
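
The ID-and-wrapper idea described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the code in the pull request: the Node type, the token-list "encoding", and the write/read helpers are hypothetical, and the sketch uses an IdentityHashMap (the description only says "hash-map") so that two equal but distinct objects are not mistaken for the same one.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class CircularRefSketch {
    /** Hypothetical record type whose 'next' field may point back to an earlier node. */
    static final class Node {
        String name;
        Node next;
        Node(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for the wrapper that is written instead of an already-seen object. */
    static final class CircularRef {
        final int id;
        CircularRef(int id) { this.id = id; }
    }

    // Per-thread table of objects seen during the current write, keyed by identity (assumption).
    private static final ThreadLocal<Map<Object, Integer>> SEEN =
        ThreadLocal.withInitial(IdentityHashMap::new);

    /** Emits a String for each new node, a CircularRef for a repeated node, or null at the end of a chain. */
    static List<Object> write(Node root) {
        Map<Object, Integer> seen = SEEN.get();
        seen.clear();
        List<Object> out = new ArrayList<>();
        try {
            for (Node cur = root; cur != null; cur = cur.next) {
                Integer id = seen.get(cur);
                if (id != null) {
                    out.add(new CircularRef(id)); // seen before: write its ID, not the object
                    return out;
                }
                seen.put(cur, seen.size());       // IDs are assigned in first-seen order
                out.add(cur.name);
            }
            out.add(null);
            return out;
        } finally {
            seen.clear();                         // the table is temporary, one write only
        }
    }

    /** Rebuilds the chain; a CircularRef is resolved to the node that was read under that ID. */
    static Node read(List<Object> tokens) {
        List<Node> byId = new ArrayList<>();
        Node root = null;
        Node prev = null;
        for (Object t : tokens) {
            Node cur;
            if (t instanceof CircularRef) {
                cur = byId.get(((CircularRef) t).id);
            } else if (t == null) {
                cur = null;
            } else {
                cur = new Node((String) t);
                byId.add(cur);                    // list index equals the ID used by write()
            }
            if (prev == null) { root = cur; } else { prev.next = cur; }
            prev = cur;
            if (cur == null || t instanceof CircularRef) { break; }
        }
        return root;
    }
}
```

In this sketch, a reader that does not want to rebuild the cycle could still consume the stream and treat a CircularRef as an ordinary value carrying an integer, which is one way to read the cross-language remark above.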

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sachingsachin/avro AVRO-695

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/avro/pull/23.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23
    
----
commit e3f9295474a3f45c022850cb4ac2ba84a8ac31f4
Author: Sachin Goyal <sgoyal@walmart.com>
Date:   2015-03-09T22:18:01Z

    AVRO-695: Support for circular references.
    
    During serialization, every object is put into a temporary thread-local hash-map whose
    key is the object and whose value is an integer ID.
    If an object is seen again while serializing, its ID is taken from the hash-map and
    wrapped in a 'CircularRef' instance, and the CircularRef wrapper is serialized instead
    of the object.
    On deserializing, if a CircularRef is encountered, we know that it holds the ID of a
    previously seen object, and so we restore a reference to that object.
    
    On the schema side, we create unions of all classes with CircularRef if the user suspects
    circular references in their code.
    This union makes sure the above writers are able to write a CircularRef instead of the
    actual object.
    
    Note that this strategy is perfectly safe when circularly referenced data is
    deserialized in other languages.

----


> Cycle Reference Support
> -----------------------
>
>                 Key: AVRO-695
>                 URL: https://issues.apache.org/jira/browse/AVRO-695
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>    Affects Versions: 1.7.6
>            Reporter: Moustapha Cherri
>         Attachments: AVRO-695.patch, AVRO-695.patch, PERF_8000_cycles.zip,
>                      avro-1.4.1-cycle.patch.gz, avro-1.4.1-cycle.patch.gz,
>                      avro_circular_references.zip, avro_circular_refs6.patch,
>                      avro_circular_refs7.patch, avro_circular_refs_2014_06_14.zip,
>                      circular_refs_and_nonstring_map_keys_2014_06_25.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This is a proposed implementation to add cycle reference support to Avro. It introduces
> a new type named Cycle. A Cycle contains a string representing the path to the other
> reference.
> For example, suppose we have an object of type Message that has a member named previous,
> also of type Message, and we have this hierarchy:
> message
>   previous : message2
> message2
>   previous : message2
> When serializing, the cycle path for "message2.previous" will be "previous".
> The implementation depends on ANTLR to evaluate those cycles at read time to resolve them.
> I used ANTLR 3.2. This dependency is not mandatory; I just used ANTLR to speed things up.
> I kept the ANTLR-generated code in this implementation, though it should instead be
> generated during the build. I only updated the Java code.
> I did not do full unit testing, but you can find an "avrotest.Main" class that can be used
> as a preliminary test.
> Please do not hesitate to contact me for further clarification if this seems interesting.
> Best regards,
> Moustapha Cherri



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
