Date: Sat, 3 Nov 2012 09:48:52 -0700
From: Mark Hayes
To: user@avro.apache.org
Subject: Re: Schema validation of a field's default values

On Mon, Oct 29, 2012 at 12:32 PM, Doug Cutting wrote:
> No, I don't know of a default value validator that's been implemented
> yet. It would be great to have one.
>
> I think this would recursively walk a schema. Whenever a non-null
> default value is found it could call ResolvingGrammarDecoder#encode().
> That's what interprets Json default values. (Perhaps this logic
> should be moved, though.)

Thanks for the reply, Doug. I did find ResolvingGrammarDecoder.encode (I saw that it is called by the builders) and was using it as you described, but I ran into limitations:

+ When the field type is an array, map or record, values of the wrong JSON type (not array or object) are translated to an empty array, map or record. For example, specifying a default of 0, null or "" results in an empty array, map or record.

+ For all numeric Avro types (int, long, float and double) the default value may be of any JSON numeric type, and the JSON value will be coerced to the Avro type even though part of the value may be lost or truncated. For example, a long default value that exceeds 32 bits will be truncated if the field is of type int.

+ The byte array length is not validated for a fixed type.

+ For nested fields and certain types (e.g., enums) a cryptic error is often output that does not contain the name of the offending field.

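To make the four limitations above concrete, here is a minimal Python sketch of a stricter check. It is an illustration only, not Avro's code and not any existing validator. The names check_default and validate_schema_defaults are hypothetical. It assumes the schema has already been parsed from JSON into dicts and lists, that a union default must match the first branch of the union, and that bytes and fixed defaults are JSON strings with one character per byte. References to named types are not resolved.

```python
INT_RANGE = (-2**31, 2**31 - 1)
LONG_RANGE = (-2**63, 2**63 - 1)


def check_default(schema, value, path="default"):
    """Return a list of error messages; an empty list means the default is valid."""
    if isinstance(schema, list):
        # Assumption: a union default must match the first branch.
        return check_default(schema[0], value, path)
    kind = schema if isinstance(schema, str) else schema["type"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_int = isinstance(value, int) and not isinstance(value, bool)
    if kind == "null":
        ok = value is None
    elif kind == "boolean":
        ok = isinstance(value, bool)
    elif kind in ("int", "long"):
        low, high = INT_RANGE if kind == "int" else LONG_RANGE
        if is_int and not low <= value <= high:
            return [f"{path}: {value} is out of range for {kind}"]
        ok = is_int
    elif kind in ("float", "double"):
        ok = is_int or isinstance(value, float)
    elif kind in ("string", "bytes"):
        ok = isinstance(value, str)
    elif kind == "fixed":
        if isinstance(value, str) and len(value) != schema["size"]:
            return [f"{path}: fixed needs {schema['size']} bytes, got {len(value)}"]
        ok = isinstance(value, str)
    elif kind == "enum":
        ok = value in schema["symbols"]
    elif kind == "array":
        if isinstance(value, list):
            return [e for i, item in enumerate(value)
                    for e in check_default(schema["items"], item, f"{path}[{i}]")]
        ok = False  # wrong JSON type is an error, not an empty array
    elif kind == "map":
        if isinstance(value, dict):
            return [e for key, item in value.items()
                    for e in check_default(schema["values"], item, f"{path}.{key}")]
        ok = False  # wrong JSON type is an error, not an empty map
    elif kind == "record":
        if isinstance(value, dict):
            errors = []
            for field in schema["fields"]:
                name = field["name"]
                if name in value:
                    errors += check_default(field["type"], value[name], f"{path}.{name}")
                elif "default" not in field:
                    errors.append(f"{path}.{name}: missing and has no default")
            return errors
        ok = False  # wrong JSON type is an error, not an empty record
    else:
        return [f"{path}: unsupported type {kind!r}"]
    return [] if ok else [f"{path}: {value!r} is not a valid {kind} default"]


def validate_schema_defaults(schema, path="$"):
    """Recursively walk a parsed schema and check every field default found."""
    errors = []
    if isinstance(schema, list):
        for branch in schema:
            errors += validate_schema_defaults(branch, path)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            for field in schema["fields"]:
                field_path = f"{path}.{field['name']}"
                if "default" in field:
                    errors += check_default(field["type"], field["default"], field_path)
                errors += validate_schema_defaults(field["type"], field_path)
        elif kind == "array":
            errors += validate_schema_defaults(schema["items"], path + "[]")
        elif kind == "map":
            errors += validate_schema_defaults(schema["values"], path + "{}")
    return errors


example_schema = {
    "type": "record", "name": "Example",
    "fields": [
        {"name": "count", "type": "int", "default": 2**31},
        {"name": "tags", "type": {"type": "array", "items": "string"}, "default": 0},
        {"name": "id", "type": {"type": "fixed", "name": "Id", "size": 4}, "default": "abc"},
        {"name": "note", "type": ["null", "string"], "default": None},
    ],
}

for message in validate_schema_defaults(example_schema):
    print(message)
```

In this example the first three fields are reported (out-of-range int, non-array default for an array, wrong length for a fixed) and the fourth is accepted. Every message starts with the path of the offending field, which is the piece missing from the cryptic errors mentioned in the last point.
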
These deficiencies can mask errors made by the user when defining a default value. Catching such errors is important to our application.

To compensate for these deficiencies we implemented our own checking that is stricter than Avro's. To do this, we serialize the default value using our own JSON serializer in a special mode where default values are applied. Any errors during serialization indicate that the default value is invalid.

Something similar might be done in Avro itself, for example, if the JSON encoder were made to operate in a special mode where default values are applied.

--mark