From: Aaron Kimball
Date: Wed, 30 Jan 2013 15:17:52 -0800
Subject: static schema validation
To: user@avro.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=20cf307f3a565f1e4304d489b952

--20cf307f3a565f1e4304d489b952
Content-Type: text/plain; charset=ISO-8859-1

Does Avro have an API that tells you, statically, whether two schemas are a match?

i.e., schema1.canRead(schema2) /** return true iff schema1 can be used as a reader schema for schema2 */

From my (admittedly cursory) scan of the docs + source, it seems like there isn't something quite that concise, though maybe this can be accomplished using ResolvingGrammarGenerator?

I'm pessimistic because of the following quote from the spec [1]:

*[matching] if both are unions:*
The first schema in the reader's union that matches the selected writer's union schema is recursively resolved against it. if none match, an error is signalled.

That sentence makes me think it's context-dependent; I interpret "the selected writer's union schema" as "the schema of the actual thing written in a data buffer, which is one of the possible schemas the writer declared in her union type". i.e., whether schema R can be a reader for some other schema W can only be determined in terms of a literal record written with W, and cannot be deduced statically for all possible records that can be encoded with schema W.

Is this interpretation correct? If so, does anyone have any ideas how to ensure the best bounds on statically-guaranteed backward compatibility between a given reader and writer?
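To make the question concrete, here is a toy sketch of the most conservative static check I can think of: treat R as a valid reader for W only if every branch of W's union has a match somewhere in R's union, so that no record W could write would fail to resolve. This is not Avro's API. The class `StaticUnionCheck` and the methods `canRead` and `matches` are hypothetical names, unions are modeled as plain lists of type names, and "matches" is simplified to name equality (it ignores type promotion and recursive record resolution).

```java
import java.util.Arrays;
import java.util.List;

/** Toy model, NOT Avro's API: a union is just a list of type names. */
public class StaticUnionCheck {

    /** Simplifying assumption: two branches match iff their type names are equal. */
    static boolean matches(String readerBranch, String writerBranch) {
        return readerBranch.equals(writerBranch);
    }

    /**
     * Conservative static check: returns true iff every branch the writer
     * could select has at least one matching branch in the reader's union.
     */
    static boolean canRead(List<String> readerUnion, List<String> writerUnion) {
        for (String writerBranch : writerUnion) {
            boolean found = false;
            for (String readerBranch : readerUnion) {
                if (matches(readerBranch, writerBranch)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false; // some datum written with W could not be resolved by R
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> reader = Arrays.asList("null", "string", "long");
        // Every writer branch is covered by the reader.
        System.out.println(canRead(reader, Arrays.asList("null", "string"))); // prints true
        // "bytes" has no match in the reader's union.
        System.out.println(canRead(reader, Arrays.asList("string", "bytes"))); // prints false
    }
}
```

A check like this can reject a reader that would in practice work (if the writer never actually selects the unmatched branch), which is exactly the gap between static and per-datum resolution that I'm asking about.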
Thanks,
- Aaron

[1] http://avro.apache.org/docs/current/spec.html#Schema+Resolution

--20cf307f3a565f1e4304d489b952--