Mailing-List: contact user-help@avro.apache.org; run by ezmlm
Reply-To: user@avro.apache.org
MIME-Version: 1.0
From: Mike Thomsen
Date: Wed, 25 Oct 2017 14:34:20 -0400
Subject: Re: Is it possible to use $ characters in field names?
To: user@avro.apache.org
Content-Type: multipart/alternative; boundary="94eb2c11f014de39d2055c634d8b"
archived-at: Wed, 25 Oct 2017 18:34:28 -0000

--94eb2c11f014de39d2055c634d8b
Content-Type: text/plain; charset="UTF-8"

It looks like I might have been overthinking this, as there is a NiFi Record
API capability for handling Date objects that looks like it might be able to
sidestep the problem entirely by converting a string into its Date
representation. I'll explore that route, and if it works I will try to follow
up with findings on the off chance someone else goes down this path.

On Wed, Oct 25, 2017 at 2:21 PM, Mike Thomsen wrote:

> The problem is actually with that processor. When I wrote it, I used a
> naive approach to reading the records and turning them into Mongo Document
> objects.
>
> Now what COULD work is if I could use the "date" logical type to create an
> Avro date that could return a java.util.Date object. Mongo's client API
> will not have a problem with that.
>
> I'll take this over to nifi-dev to see what others think.
>
> Thanks.
>
> On Wed, Oct 25, 2017 at 12:08 PM, Sean Busbey wrote:
>
>> Shoot. My copying in the NiFi user list failed. Mike, if using the
>> PutMongoRecord processor might work, the folks on that list are more likely
>> to be able to help with edge cases.
>>
>> If you need the intermediate JSON for some reason, I think there's a JSON
>> transforming processor that you could maybe use to rewrite the JSON records
>> with the right field name?
>>
>> On Wed, Oct 25, 2017 at 11:05 AM, Sean Busbey
>> wrote:
>>
>>> +users@nifi.apache.org[1]
>>>
>>> Could you keep the data in Avro and then use NiFi's PutMongoRecord
>>> processor[2] with an AvroReader to insert?
>>>
>>> [1]: https://lists.apache.org/list.html?users@nifi.apache.org
>>> [2]: https://s.apache.org/MmPG
>>>
>>> On Wed, Oct 25, 2017 at 7:51 AM, Mike Thomsen
>>> wrote:
>>>
>>>> No, it doesn't look like it's going to work. It accepts $date into the
>>>> record using the alias, but it doesn't generate $date as the field name
>>>> when writing the object back to JSON.
>>>>
>>>> On Wed, Oct 25, 2017 at 8:19 AM, Nandor Kollar
>>>> wrote:
>>>>
>>>>> Oh yes, you're right, you're running into the limitation on field names
>>>>> <https://avro.apache.org/docs/1.8.0/spec.html#names>. Apart from
>>>>> solving this via a map, you might consider using Avro aliases, since it
>>>>> looks like aliases don't have this limitation. Can you use them?
>>>>>
>>>>> Nandor
>>>>>
>>>>> On Wed, Oct 25, 2017 at 1:40 PM, Mike Thomsen
>>>>> wrote:
>>>>>
>>>>>> Hi Nandor,
>>>>>>
>>>>>> It's not the numeric portion that is the problem for me, but the
>>>>>> $date field name. Mongo apparently requires the structure I provided in the
>>>>>> example, and whenever I use $date as the field name the Java Avro API
>>>>>> throws an exception about an invalid character in the field definition.
>>>>>>
>>>>>> The logical type thing is good to know for future reference.
>>>>>>
>>>>>> I admit that this is likely a really uncommon edge case for Avro. The
>>>>>> workaround I found for defining a schema that is at least compatible with
>>>>>> the Mongo Extended JSON requirements was to do this (one field example):
>>>>>>
>>>>>> {
>>>>>>     "namespace": "test",
>>>>>>     "name": "PutTestRecord",
>>>>>>     "type": "record",
>>>>>>     "fields": [{
>>>>>>         "name": "timestampField",
>>>>>>         "type": {
>>>>>>             "type": "map",
>>>>>>             "values": "long"
>>>>>>         }
>>>>>>     }]
>>>>>> }
>>>>>>
>>>>>> It doesn't give you the full validation that would be ideal if we
>>>>>> could define a field with the name "$date", but it's an 80% solution that
>>>>>> works with NiFi and other tools that have to generate Extended JSON for
>>>>>> Mongo.
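The naming limitation and the map workaround discussed above can be sketched in a few lines. The Avro specification requires names to start with [A-Za-z_] and to continue with only [A-Za-z0-9_], which is why "$date" is rejected as a field name yet is acceptable as a map key. What follows is a minimal sketch using only the Python standard library, not an Avro library; `is_valid_avro_name` and `to_extended_json` are hypothetical helpers, and the timestamp value is illustrative.

```python
import json
import re

# Name rule from the Avro specification: start with [A-Za-z_],
# then only [A-Za-z0-9_]. This regex is a stand-in for the check an
# Avro library performs; it is not the library's own code.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name):
    """Return True if `name` satisfies the Avro name rule."""
    return AVRO_NAME.fullmatch(name) is not None


def to_extended_json(record):
    """Serialize a record whose map-typed fields carry Extended JSON keys."""
    return json.dumps(record, sort_keys=True)


# Hypothetical record matching the map-based schema: the field name is a
# legal Avro name, while "$date" lives in the map as an ordinary string key.
record = {"timestampField": {"$date": 1508956460000}}

print(is_valid_avro_name("timestampField"))  # True
print(is_valid_avro_name("$date"))           # False
print(to_extended_json(record))
```

Because map keys are plain strings that are not checked against the name rule, the schema cannot require that the key is exactly "$date". That is the validation gap the workaround accepts.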
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Wed, Oct 25, 2017 at 4:48 AM, Nandor Kollar
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mike,
>>>>>>>
>>>>>>> This JSON doesn't seem like a valid Avro schema. If you'd
>>>>>>> like to use timestamps in your schema, you should use the Timestamp
>>>>>>> logical types, which annotate Avro longs. In this case the schema of
>>>>>>> this field should look like this (note that logicalType belongs inside
>>>>>>> the type definition, not beside it at the field level):
>>>>>>>
>>>>>>> {
>>>>>>>     "name": "timestamp",
>>>>>>>     "type": {
>>>>>>>         "type": "long",
>>>>>>>         "logicalType": "timestamp-millis"
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>> If you'd like to create Avro files with this schema, the Avro wiki
>>>>>>> has a brief tutorial on how to create and write Avro files with this
>>>>>>> schema in Java.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nandor
>>>>>>>
>>>>>>> On Tue, Oct 24, 2017 at 8:18 PM, Mike Thomsen <mikerthomsen@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I am trying to build an Avro schema for a NiFi flow that is going
>>>>>>>> to insert data into Mongo, and Mongo Extended JSON requires the use of $
>>>>>>>> characters in cases like this (to represent a date):
>>>>>>>>
>>>>>>>> {
>>>>>>>>     "timestamp": {
>>>>>>>>         "$date": TIMESTAMP_LONG_HERE
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> I tried building a schema with that, and it failed saying there was
>>>>>>>> an invalid character in the schema. Just wanted to check and see if there
>>>>>>>> was a workaround for this or if I'll have to choose another option.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>
>>> --
>>> busbey
>>>
>>
>> --
>> busbey
>>

--94eb2c11f014de39d2055c634d8b
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--94eb2c11f014de39d2055c634d8b--
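
As a footnote to the timestamp-millis suggestion in the thread: that logical type annotates an Avro long holding the number of milliseconds since the Unix epoch (UTC). A minimal sketch of the conversion in both directions, using only the Python standard library, might look like the following. The helper names are hypothetical, and the sample instant is simply the Date header of the final reply.

```python
from datetime import datetime, timedelta, timezone

# Reference instant for timestamp-millis: 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_timestamp_millis(millis):
    """Convert milliseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


# 14:34:20 -0400 on 25 Oct 2017, the send time of the final reply.
sent = datetime(2017, 10, 25, 14, 34, 20, tzinfo=timezone(timedelta(hours=-4)))
millis = to_timestamp_millis(sent)

print(millis)                                     # 1508956460000
print(from_timestamp_millis(millis).isoformat())  # 2017-10-25T18:34:20+00:00
```

The long produced here is the kind of value that would sit under the "$date" key in the Extended JSON shown in the original question.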