From: Ryan Blue
Reply-To: rblue@netflix.com
Date: Fri, 20 Oct 2017 09:47:22 -0700
Subject: Re: (Default) values for logical types in human-readable form
To: Avro Dev List

> So if I understand correctly, you support the idea of human-readable
> defaults but not logical-type-dependent interpretation.
> I don't see how we could achieve the first without the second, since
> different logical types have different human-readable representations. So it
> seems that the optional nature of logical types actually makes this feature
> impossible.

Sorry, I can see how that was confusing. I would use the logical type to determine the transform, but wouldn't require the user to have configured a conversion for the type, which is optional. Basically, I'm saying that this feature would support decimal, date, time, and timestamp from string.

The conversion from string would happen when parsing a schema and would set the "default" field from another string-based field, if "default" doesn't already exist. That way, we always have the "default" that all readers use. For example, instead of defining

  {"name":"d","type":"fixed","logicalType":"decimal",...,"default":"\u000C\u006C"}

you would use

  {"name":"d","type":"fixed",...,"string-default":"31.80"}

which gets translated at parse time into the form with a "default" field. I think that would work in all cases.

On the subject of AVDL, I think we clearly have a case where people are editing schemas directly, so it makes sense to support this. We can also extend IDL to use an annotation or something that bakes down to "string-default" in the schema. I'm not very familiar with the IDL, though, so I can't say exactly what we would need to do here.

rb

On Fri, Oct 20, 2017 at 6:08 AM, Bridger Howell wrote:

> On Fri, Oct 20, 2017 at 2:04 AM, Frédéric SOUCHU
> <Frederic.SOUCHU@ingenico.com> wrote:
>
> > In line with Philip Zeyliger on IDL being a good tool for a human to
> > produce schema.
> > Key features (IMHO):
> > - support for includes (killer feature)
> > - simpler syntax (a *lot* less '{' and '['...)
> > - simpler comments syntax
>
> I'm afraid you're missing my point. I'm not arguing that IDL isn't a "good"
> tool for producing schemas.
> I'm arguing that I don't think, as it is, it
> should be the preferred tool for writing schemas.
>
> Implicitly, I'm also extending that to mean we shouldn't currently prefer
> to give new features like this only to IDL, unless we want to make the
> process for using IDL for schemas cleaner and simpler. If we're willing to
> do that, then I have no issues.
>
> > I have a toolchain going from IDL to Java + C# classes that wouldn't work
> > using JSON schema (the many holes in the AVRO C# side not helping
> > either..).
>
> Cool. I helped build something similar and I work with others who use it
> regularly.
>
> > (btw, how did we end up with different json/IDL logical names?!?!)
>
> I assume you're referring to the keywords like "timestamp_ms" and "date"
> added to IDL to refer to the "timestamp-millis" and "date" logical types?
>
> These are special keywords, not a general mechanism that produces schemas
> with the given logical type so there's no particular reason that they have
> to match the logical type that they implement (although it does seem
> inconsistent). I looked around AVRO-1684
> (https://issues.apache.org/jira/browse/AVRO-1684) where this was implemented
> for some justification, but I didn't find anything.
>
> > Suggest to have an 'encoding' attribute to indicate how the default value
> > is defined.
> > {
> >   "type": "bytes",
> >   "logicalType": "decimal",
> >   "precision": 4,
> >   "scale": 2,
> >   "default":"3.151351351",
> >   "default-encoding":"string" // default encoding being 'AVRO' default (e.g. binary)
> > }
>
> This has the same problem as one of my earlier suggestions; if you change
> the meaning of the default field, then older readers will read the schema
> with an incorrect default value.
>
> - Bridger Howell

-- 
Ryan Blue
Software Engineer
Netflix
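
For illustration only, the parse-time translation proposed in this message (deriving "default" from a "string-default" attribute) might be sketched in Python roughly as below. Everything here is an assumption rather than an existing Avro API: "string-default" is only the attribute name floated in this thread, the function names are hypothetical, the field uses the nested type form with "logicalType", and only the decimal logical type is handled. The encoding itself follows the Avro rules for decimals (big-endian two's-complement unscaled value) and for bytes/fixed defaults (a JSON string whose code points 0-255 stand for byte values).

```python
from decimal import Decimal


def decimal_string_to_default(text, scale, size=None):
    """Encode a decimal string as the JSON default of a bytes/fixed decimal.

    Hypothetical helper: not part of any Avro library.
    """
    unscaled = Decimal(text).scaleb(scale)
    if unscaled != unscaled.to_integral_value():
        raise ValueError(f"{text!r} has more than {scale} fractional digits")
    n = int(unscaled)
    if size is None:
        # Minimal two's-complement width: magnitude bits plus one sign bit.
        bits = (n if n >= 0 else ~n).bit_length() + 1
        size = (bits + 7) // 8
    raw = n.to_bytes(size, "big", signed=True)
    # Code points 0-255 map one-to-one to byte values, i.e. a latin-1 decode.
    return raw.decode("latin-1")


def apply_string_default(field):
    """Return a copy of `field` with "default" filled from "string-default".

    Hypothetical parse-time step; an explicit "default" always wins.
    """
    out = dict(field)
    if "default" in out or "string-default" not in out:
        return out
    ftype = out["type"]
    if not (isinstance(ftype, dict) and ftype.get("logicalType") == "decimal"):
        raise ValueError("this sketch only handles the decimal logical type")
    # A fixed type must be padded (sign-extended) to its declared size.
    size = ftype["size"] if ftype["type"] == "fixed" else None
    out["default"] = decimal_string_to_default(
        out["string-default"], ftype.get("scale", 0), size)
    return out


field = {
    "name": "d",
    "type": {"type": "bytes", "logicalType": "decimal",
             "precision": 4, "scale": 2},
    "string-default": "31.80",
}
parsed = apply_string_default(field)
print(parsed["default"] == "\u000c\u006c")  # True
```

Because the result is an ordinary "default", a reader that knows nothing about "string-default" still sees a valid bytes default, which is the compatibility property the proposal relies on. A string such as "3.151351351" with scale 2 is rejected by this sketch instead of being silently rounded; that is a design choice of the sketch, not something decided in the thread.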