avro-dev mailing list archives

From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-656) writing unions with multiple records, fixed or enums can choose wrong branch
Date Sun, 20 Feb 2011 09:14:38 GMT

    [ https://issues.apache.org/jira/browse/AVRO-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997115#comment-12997115 ]

Scott Carey commented on AVRO-656:

I solved the unit test error. ResolvingGrammarGenerator.bestBranch() was only checking for
records; it needs to check fixed and enum too:

       int j = 0;
       for (Schema b : r.getTypes()) {
         if (vt == b.getType())
-          if (vt == Schema.Type.RECORD) {
-            String vname = w.getName();
-            if (vname == null || vname.equals(b.getName()))
+          if (vt == Schema.Type.RECORD || vt == Schema.Type.ENUM || 
+              vt == Schema.Type.FIXED) {
+            String vname = w.getFullName();
+            String bname = b.getFullName();
+            if ((vname != null && vname.equals(bname))
+                || vname == bname)
               return j;
           } else
             return j;
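
The patched check can be sketched in isolation. This is a minimal stand-alone illustration, not the Avro source: `Kind`, `Named` and this `bestBranch` are hypothetical stand-ins for `Schema.Type`, `Schema` and `ResolvingGrammarGenerator.bestBranch()`, and the full name is modeled as a plain string that may be null. The `-1` return for "no match" is also an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class BestBranchSketch {
  // Hypothetical stand-in for Schema.Type, reduced to what the example needs.
  enum Kind { RECORD, ENUM, FIXED, STRING }

  // Hypothetical stand-in for Schema: a kind plus an optional full name.
  static final class Named {
    final Kind kind;
    final String fullName; // null when no name is available

    Named(Kind kind, String fullName) {
      this.kind = kind;
      this.fullName = fullName;
    }
  }

  // Returns the index of the first branch matching the written schema, or -1 if none does.
  static int bestBranch(List<Named> branches, Named written) {
    int j = 0;
    for (Named b : branches) {
      if (written.kind == b.kind) {
        boolean namedType = written.kind == Kind.RECORD
            || written.kind == Kind.ENUM
            || written.kind == Kind.FIXED;
        // Named types must also agree on full name.
        // Objects.equals(a, b) is true when both are null or when a.equals(b),
        // which is the same condition as the added lines of the diff.
        if (!namedType || Objects.equals(written.fullName, b.fullName)) {
          return j;
        }
      }
      j++;
    }
    return -1;
  }

  public static void main(String[] args) {
    List<Named> union = Arrays.asList(
        new Named(Kind.ENUM, "a.Color"),
        new Named(Kind.ENUM, "b.Color"));
    // Two enums with the same short name in different namespaces.
    System.out.println(bestBranch(union, new Named(Kind.ENUM, "b.Color"))); // prints 1
  }
}
```

With a type-only check, the second enum here would resolve to branch 0, which is the wrong-branch behavior this issue describes.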

{quote}The intent there was to be back-compatible, to not require a schema be passed to the
constructor, but perhaps that's not worth it.{quote}

Perhaps we can allow null to equal null, and strings to match, but disallow null from matching
a non-null name. This will cause problems if people mix and match them, but will work with the
old constructors if the user is consistent. So maybe that is not worth it.
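
The difference in null handling between the removed and added lines of the diff can be shown side by side. This is only a sketch: the class and method names `NameMatchSketch`, `lenient` and `strict` are hypothetical, and it isolates the null handling only (the actual patch also switches from `getName()` to `getFullName()`).

```java
import java.util.Objects;

public class NameMatchSketch {
  // Rule from the removed lines of the diff: a null written name matches any branch name.
  static boolean lenient(String vname, String bname) {
    return vname == null || vname.equals(bname);
  }

  // Rule from the added lines: null matches only null, otherwise the names must be equal.
  static boolean strict(String vname, String bname) {
    return Objects.equals(vname, bname);
  }

  public static void main(String[] args) {
    // Mixed case: the written value has no name, the branch has one.
    System.out.println(lenient(null, "org.example.Foo")); // prints true
    System.out.println(strict(null, "org.example.Foo"));  // prints false
    // Consistent cases: both null, or both the same string.
    System.out.println(strict(null, null));                            // prints true
    System.out.println(strict("org.example.Foo", "org.example.Foo"));  // prints true
  }
}
```

The mixed case is the one that changes behavior: it matched under the old rule and no longer matches under the new one.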

> writing unions with multiple records, fixed or enums can choose wrong branch 
> -----------------------------------------------------------------------------
>                 Key: AVRO-656
>                 URL: https://issues.apache.org/jira/browse/AVRO-656
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.4.0
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>            Priority: Blocker
>             Fix For: 1.5.0
>         Attachments: AVRO-656.patch, AVRO-656.patch, AVRO-656.patch, AVRO-656.patch,
> According to the specification, a union may contain multiple instances of a named type,
> provided they have different names.  There are several bugs in the Java implementation of
> this when writing data:
>  - for record, only the short name of the record is checked, so the branch for a record
>    of the same name in a different namespace may be used by mistake
>  - for enum and fixed, the name is not checked, so the first enum or fixed in the union
>    will always be assumed when writing.  In many cases this may cause the wrong data to
>    be written, potentially corrupting output.
> This is not a regression.  This has never been implemented correctly by Java.  Python
> and Ruby never check names, but rather perform a full, recursive validation of content.
