Reply-To: user@avro.apache.org
MIME-Version: 1.0
From: Xu Yang
Date: Wed, 8 Feb 2017 13:37:51 -0500
Subject: Avro C++ library potential bug in union type
To: user@avro.apache.org
Content-Type: multipart/alternative; boundary=94eb2c1917c07afcf80548092944
archived-at: Wed, 08 Feb 2017 18:38:02 -0000

--94eb2c1917c07afcf80548092944
Content-Type: text/plain; charset=UTF-8

Hello Community,

We are currently working on integrating Avro into our products. We are excited about Avro's schema evolution feature; however, we found that the avro-cpp library did not support it very well in older versions (we previously used 1.7.0).

I found some JIRAs (AVRO-1360 and AVRO-1474) which indicate that some schema-evolution bugs have been fixed since 1.7.7, so we upgraded our avro-cpp library from 1.7.0 to the latest release, 1.8.1. The upgrade did resolve the schema-evolution problem we had with the old avro-cpp, but it also broke some of our existing tests, which looks like a regression.

According to the Avro spec (http://avro.apache.org/docs/1.7.7/spec.html), a new note has been added to the Unions section since 1.7.7:

*"Note that when a default value is specified for a record field whose type is a union, the type of the default value must match the first element of the union. Thus, for unions containing "null", the "null" is usually listed first, since the default value of such unions is typically null."*

Based on this description, a union like ["null","string"] should only have the default value null ("default": null), and this works fine in 1.8.1. Other unions like ["string","null"] should have a default value like "default": "test", but this fails in the latest version when it tries to construct an Avro schema object from a string. It also fails for similar cases like ["int","null"] or ["float","null"].

I have dived into the avro-cpp source code a little. In the failing case it seems to hit an assert that forces the default value's type to be json::etObject whenever it is not json::etNull. To me this does not seem always correct: according to the spec, the default value's type can be string, int, or anything else, as long as it matches the type of the first element in the union.
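To make the mismatch concrete, here is a minimal standalone sketch of the two rules side by side. It is not avro-cpp code: JsonKind, AvroType, specAllowsDefault, quotedCheckAllows and checkReportedCase are hypothetical names made up for illustration, and quotedCheckAllows models only the entry check visible in the Compiler.cc excerpt quoted below, not whatever the library does afterwards.

```cpp
#include <cassert>

// Hypothetical stand-ins for this sketch only; these are NOT avro-cpp types.
enum class JsonKind { Null, Bool, Integer, Double, String, Array, Object };
enum class AvroType { Null, Boolean, Int, Long, Float, Double, String };

// Rule from the spec note: the default value of a union-typed field must
// match the type of the FIRST branch of the union.
bool specAllowsDefault(AvroType firstBranch, JsonKind def) {
    switch (firstBranch) {
    case AvroType::Null:    return def == JsonKind::Null;
    case AvroType::Boolean: return def == JsonKind::Bool;
    case AvroType::Int:
    case AvroType::Long:    return def == JsonKind::Integer;
    case AvroType::Float:
    case AvroType::Double:
        // Assumption: a JSON integer such as 1 also counts as a JSON number.
        return def == JsonKind::Double || def == JsonKind::Integer;
    case AvroType::String:  return def == JsonKind::String;
    }
    return false;
}

// Entry check as it appears in the quoted 1.8.1 excerpt: a default that is
// neither JSON null nor a JSON object is rejected up front.
bool quotedCheckAllows(JsonKind def) {
    return def == JsonKind::Null || def == JsonKind::Object;
}

// The reported case: ["string","null"] with "default": "test".
void checkReportedCase() {
    assert(specAllowsDefault(AvroType::String, JsonKind::String)); // spec accepts
    assert(!quotedCheckAllows(JsonKind::String));                  // check rejects
}
```

Calling checkReportedCase() from a small main() passes both assertions, which is exactly the disagreement described above: the spec rule accepts the string default while the object-or-null check rejects it.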
avro-cpp-1.8.1\impl\Compiler.cc (lines 282-300):

    case AVRO_UNION:
    {
        GenericUnion result(n);
        string name;
        Entity e2;
        if (e.type() == json::etNull) {
            name = "null";
            e2 = e;
        } else {
            assertType(e, json::etObject);
            const map<string, Entity>& v = e.objectValue();
            if (v.size() != 1) {
                throw Exception(boost::format("Default value for "
                    "union has more than one field: %1%") % e.toString());
            }
            map<string, Entity>::const_iterator it = v.begin();
            name = it->first;
            e2 = it->second;
        }

It seems that all of the code above was added in svn revision *1606545* by @thiru to fix JIRA AVRO-1474, which I mentioned above:
https://svn.apache.org/viewvc/avro/trunk/lang/c%2B%2B/impl/Compiler.cc?view=log&sortby=date&pathrev=1606545

I have already created a minimal repo which consistently reproduces this problem. Can I file a JIRA to track it? I will attach my repo there.

Thank you very much!

Yang

--94eb2c1917c07afcf80548092944
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--94eb2c1917c07afcf80548092944--