avro-user mailing list archives

From Mike Thomsen <mikerthom...@gmail.com>
Subject Re: Is it possible to use $ characters in field names?
Date Wed, 25 Oct 2017 18:34:20 GMT
It looks like I might have been overthinking this, as there is a NiFi
Record API capability for handling Date objects that looks like it might be
able to sidestep the problem entirely by converting a string into the
corresponding Date representation. I'll explore that route, and if it works
I will try to follow up with findings on the off chance someone else goes
down this path.
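
For anyone landing on this thread later, here is a minimal sketch of why
"$date" is rejected as a field name but is fine as a map key. It is an
illustration only, not code from NiFi or the Avro library: the pattern
follows the naming rule in the Avro spec (start with [A-Za-z_], continue
with [A-Za-z0-9_]), and the helper names are hypothetical.

```python
import re

# Naming rule from the Avro spec: a name starts with [A-Za-z_] and
# continues with [A-Za-z0-9_]. This pattern is a re-statement of that
# rule, not the Avro library's own validator.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Return True if `name` satisfies the Avro spec's naming rule."""
    return AVRO_NAME.fullmatch(name) is not None


def to_extended_json_date(epoch_millis: int) -> dict:
    """Wrap epoch milliseconds in the {"$date": ...} shape.

    Hypothetical helper: with a map-typed field, "$date" is an ordinary
    string key in the data, so the naming rule does not apply to it.
    """
    return {"$date": epoch_millis}


if __name__ == "__main__":
    for candidate in ("timestampField", "$date"):
        print(candidate, is_valid_avro_name(candidate))
    print({"timestampField": to_extended_json_date(1508956460000)})
```

The trade-off is the one described in the quoted thread: a map of longs
accepts any key, so the schema no longer guarantees that "$date" is the
key that is present.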

On Wed, Oct 25, 2017 at 2:21 PM, Mike Thomsen <mikerthomsen@gmail.com>
wrote:

> The problem is actually with that processor. When I wrote it, I used a
> naive approach to reading the records and turning them into Mongo Document
> objects.
>
> Now what COULD work is if I could use the "date" logical type to create an
> Avro date that could return a java.util.Date object. Mongo's client API
> will not have a problem with that.
>
> I'll take this over to nifi-dev to see what others think.
>
> Thanks.
>
> On Wed, Oct 25, 2017 at 12:08 PM, Sean Busbey <busbey@cloudera.com> wrote:
>
>> Shoot, my attempt to copy in the NiFi user list failed. Mike, if using
>> the PutMongoRecord processor might work, the folks on that list are more
>> likely to be able to help with edge cases.
>>
>> If you need the intermediate JSON for some reason, I think there's a JSON
>> transforming processor that you could maybe use to rewrite the JSON records
>> with the right field name?
>>
>> On Wed, Oct 25, 2017 at 11:05 AM, Sean Busbey <busbey@cloudera.com>
>> wrote:
>>
>>> +users@nifi.apache.org[1]
>>>
>>> Could you keep the data in Avro and then use NiFi's PutMongoRecord
>>> processor[2] with an AvroReader to insert?
>>>
>>>
>>> [1]: https://lists.apache.org/list.html?users@nifi.apache.org
>>> [2]: https://s.apache.org/MmPG
>>>
>>> On Wed, Oct 25, 2017 at 7:51 AM, Mike Thomsen <mikerthomsen@gmail.com>
>>> wrote:
>>>
>>>> No, it doesn't look like it's going to work. It accepts $date into the
>>>> record using the alias, but it doesn't generate $date as the field name
>>>> when writing the object back to JSON.
>>>>
>>>> On Wed, Oct 25, 2017 at 8:19 AM, Nandor Kollar <nkollar@cloudera.com>
>>>> wrote:
>>>>
>>>>> Oh yes, you're right, you've run into the restriction on field names
>>>>> <https://avro.apache.org/docs/1.8.0/spec.html#names>. Apart from
>>>>> solving this via a map, you might consider using Avro aliases
>>>>> <https://avro.apache.org/docs/1.8.2/spec.html#Aliases>, since it looks
>>>>> like aliases don't have this limitation. Can you use them?
>>>>>
>>>>> Nandor
>>>>>
>>>>> On Wed, Oct 25, 2017 at 1:40 PM, Mike Thomsen <mikerthomsen@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Nandor,
>>>>>>
>>>>>> It's not the numeric portion that is the problem for me, but the
>>>>>> $date field name. Mongo apparently requires the structure I provided
>>>>>> in the example, and whenever I use $date as the field name the Java
>>>>>> Avro API throws an exception about an invalid character in the field
>>>>>> definition.
>>>>>>
>>>>>> The logical type thing is good to know for future reference.
>>>>>>
>>>>>> I admit that this is likely a really uncommon edge case for Avro. The
>>>>>> workaround I found for defining a schema that is at least compatible
>>>>>> with the Mongo Extended JSON requirements was to do this (one-field
>>>>>> example):
>>>>>>
>>>>>> {
>>>>>>     "namespace": "test",
>>>>>>     "name": "PutTestRecord",
>>>>>>     "type": "record",
>>>>>>     "fields": [{
>>>>>>         "name": "timestampField",
>>>>>>         "type": {
>>>>>>             "type": "map",
>>>>>>             "values": "long"
>>>>>>         }
>>>>>>     }]
>>>>>> }
>>>>>>
>>>>>> It doesn't give you the full validation that would be ideal if we
>>>>>> could define a field with the name "$date", but it's an 80% solution
>>>>>> that works with NiFi and other tools that have to generate Extended
>>>>>> JSON for Mongo.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Wed, Oct 25, 2017 at 4:48 AM, Nandor Kollar <nkollar@cloudera.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mike,
>>>>>>>
>>>>>>> This JSON doesn't seem to be a valid Avro schema
>>>>>>> <https://avro.apache.org/docs/1.8.1/spec.html#schemas>. If you'd
>>>>>>> like to use timestamps in your schema, you should use Timestamp
>>>>>>> logical types
>>>>>>> <https://avro.apache.org/docs/1.8.1/spec.html#Timestamp+%28millisecond+precision%29>,
>>>>>>> which annotate Avro longs. In this case the schema of this field
>>>>>>> should look like this:
>>>>>>>
>>>>>>> {
>>>>>>>    "name":"timestamp",
>>>>>>>    "type":{
>>>>>>>       "type":"long",
>>>>>>>       "logicalType":"timestamp-millis"
>>>>>>>    }
>>>>>>> }
>>>>>>>
>>>>>>> If you'd like to create Avro files with this schema, the Avro
>>>>>>> documentation has a brief tutorial
>>>>>>> <https://avro.apache.org/docs/1.8.1/gettingstartedjava.html#Compiling+the+schema>
>>>>>>> on how to create and write Avro files in Java.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nandor
>>>>>>>
>>>>>>> On Tue, Oct 24, 2017 at 8:18 PM, Mike Thomsen <
>>>>>>> mikerthomsen@gmail.com> wrote:
>>>>>>>
>>>>>>>> I am trying to build an Avro schema for a NiFi flow that is going
>>>>>>>> to insert data into Mongo, and Mongo Extended JSON requires the use
>>>>>>>> of $ characters in cases like this (to represent a date):
>>>>>>>>
>>>>>>>> {
>>>>>>>>     "timestamp": {
>>>>>>>>         "$date": TIMESTAMP_LONG_HERE
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> I tried building a schema with that, and it failed saying there was
>>>>>>>> an invalid character in the schema. I just wanted to check and see
>>>>>>>> if there was a workaround for this or if I'll have to choose another
>>>>>>>> option.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> busbey
>>>
>>
>>
>>
>> --
>> busbey
>>
>
>
