avro-user mailing list archives

From Xu Yang <yang.jim...@gmail.com>
Subject Avro C++ library potential bug in union type
Date Wed, 08 Feb 2017 18:37:51 GMT
Hello Community,

We are currently working on integrating Avro into our products. We are
excited about Avro's schema evolution feature; however, we found that older
versions of the avro-cpp library (1.7.0, which is what we used before) do
not support it very well.

I found some JIRAs (AVRO-1360
<https://issues.apache.org/jira/browse/AVRO-1360> & AVRO-1474
<https://issues.apache.org/jira/browse/AVRO-1474>) online which indicate
that some schema-evolution bugs have been fixed since 1.7.7, so we upgraded
our avro-cpp library from 1.7.0 to the latest release, 1.8.1. This did
resolve the problem we had with the old avro-cpp; however, it also breaks
some of our existing tests, which looks like a regression.

According to the Avro spec (http://avro.apache.org/docs/1.7.7/spec.html),
a new note was added to the section on unions as of 1.7.7: *"Note that when
a default value
<http://avro.apache.org/docs/1.7.7/spec.html#schema_record> is specified
for a record field whose type is a union, the type of the default value
must match the first element of the union. Thus, for unions containing
"null", the "null" is usually listed first, since the default value of such
unions is typically null."*

Based on this description, a union like ["null","string"] can only have
the default value "default": null, and this works fine in 1.8.1. Other
unions like ["string","null"] should have a default value like
"default": "test", but in the latest version this fails when it tries to
construct an Avro schema object from a string. It also fails for similar
cases like ["int","null"] or ["float","null"].

I have dived into the avro-cpp source code a little bit. The failing case
seems to hit an assert that forces the default value's type to be
json::etObject whenever it is not json::etNull. To me this does not seem
always correct: the default value can be a string, an int, or anything
else, as long as it matches the type of the first element in the union,
according to the spec.

avro-cpp-1.8.1\impl\Compiler.cc (lines 282-300):
    case AVRO_UNION:
    {
        GenericUnion result(n);
        string name;
        Entity e2;
        if (e.type() == json::etNull) {
            name = "null";
            e2 = e;
        } else {
            assertType(e, json::etObject);
            const map<string, Entity>& v = e.objectValue();
            if (v.size() != 1) {
                throw Exception(boost::format("Default value for "
                    "union has more than one field: %1%") %
                    e.toString());
            }
            map<string, Entity>::const_iterator it = v.begin();
            name = it->first;
            e2 = it->second;
        }
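
To make the mismatch concrete, here is a minimal, self-contained C++ sketch.
It is not avro-cpp code: JsonKind, acceptedByQuotedCheck and
acceptedBySpecRule are hypothetical names that only model the two rules being
compared, namely the type test in the snippet above and the "default must
match the first element of the union" note from the spec. Only primitive
first branches are modelled, and accepting an integral JSON number as a
float/double default is my assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the JSON entity kinds seen by the schema
// compiler; this is NOT the avro-cpp json entity type enum.
enum class JsonKind { Null, Bool, Long, Double, String, Array, Object };

// Models only the type test in the quoted snippet: a union default is
// accepted if it is JSON null or a JSON object. The "exactly one field"
// check on the object is not modelled here.
bool acceptedByQuotedCheck(JsonKind defaultKind) {
    return defaultKind == JsonKind::Null || defaultKind == JsonKind::Object;
}

// Models the spec note: the default value must match the FIRST element of
// the union. Only primitive first branches are handled in this sketch.
bool acceptedBySpecRule(const std::vector<std::string>& branches,
                        JsonKind defaultKind) {
    assert(!branches.empty());  // this sketch expects at least one branch
    const std::string& first = branches.front();
    if (first == "null") return defaultKind == JsonKind::Null;
    if (first == "boolean") return defaultKind == JsonKind::Bool;
    if (first == "int" || first == "long") return defaultKind == JsonKind::Long;
    if (first == "float" || first == "double") {
        // Assumption: an integral JSON number is also fine for float/double.
        return defaultKind == JsonKind::Double || defaultKind == JsonKind::Long;
    }
    if (first == "string" || first == "bytes") {
        return defaultKind == JsonKind::String;
    }
    return false;  // records, arrays, maps, etc. are out of scope here
}
```

For ["string","null"] with a string default, acceptedBySpecRule returns true
while acceptedByQuotedCheck returns false, which is the disagreement I am
seeing. For ["null","string"] with a null default, both return true.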

It seems all of the code above was added in svn revision *1606545
<https://svn.apache.org/viewvc?view=revision&sortby=date&revision=1606545>*
by @thiru to fix the JIRA AVRO-1474 I mentioned above.
https://svn.apache.org/viewvc/avro/trunk/lang/c%2B%2B/impl/Compiler.cc?view=log&sortby=date&pathrev=1606545


I have already created a minimal repo which consistently reproduces this
problem. Can I file a JIRA to track it? I will attach my repo there.


Thank you very much!
Yang
