nifi-users mailing list archives

From James Wing <jvw...@gmail.com>
Subject Re: Pulling API Endpoints into Kafka Topics in Avro
Date Tue, 28 Mar 2017 16:51:32 GMT
Steve,

The inferred schemas can be helpful to get you started, but I recommend
providing your own Avro schema based on your knowledge of what should be
guaranteed to downstream systems.  If you want to pass untyped data, you
can't really beat JSON.  Avro schema isn't so bad, honest.

On the numeric key issue, your snippet above suggests that the keys are
not fixed across samples. That might be covered by using an Avro "map"
type rather than a "record":

{
    "type": "record",
    "name": "testrecord",
    "fields": [
        {
            "name": "metricsPerAgent",
            "type": {
                "type": "map",
                "values": {
                    "type": "record",
                    "name": "agentMetrics",
                    "fields": [
                        {
                            "name": "connectedEngagements",
                            "type": "long"
                        },
                        {
                            "name": "nonInteractiveTotalHandlingTime",
                            "type": "long"
                        }
                    ]
                }
            }
        }
    ]
}
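If it helps to see why the map type sidesteps the problem, here is a quick stdlib-only Python sketch (not NiFi or the official Avro library; the `conforms` helper is purely illustrative) that checks your sample against the schema above. The point it demonstrates: "record" field names must be valid Avro identifiers, but "map" keys are arbitrary strings, so numeric agent IDs like "6453" are fine.

```python
import json

# Minimal, illustrative Avro-style type checker (NOT the real Avro library).
# It handles only the three schema types used above: "record", "map", "long".
def conforms(value, schema):
    t = schema["type"] if isinstance(schema, dict) else schema
    if t == "long":
        return isinstance(value, int)
    if t == "map":
        # Map keys are arbitrary strings -- "6453" is perfectly legal here.
        return isinstance(value, dict) and all(
            isinstance(k, str) and conforms(v, schema["values"])
            for k, v in value.items()
        )
    if t == "record":
        # Record field names are fixed identifiers declared in the schema.
        return isinstance(value, dict) and all(
            f["name"] in value and conforms(value[f["name"]], f["type"])
            for f in schema["fields"]
        )
    return False

sample = json.loads("""
{
  "metricsPerAgent": {
    "6453": {"connectedEngagements": 3, "nonInteractiveTotalHandlingTime": 0},
    "6454": {"connectedEngagements": 1, "nonInteractiveTotalHandlingTime": 0}
  }
}
""")

schema = {
    "type": "record",
    "name": "testrecord",
    "fields": [{
        "name": "metricsPerAgent",
        "type": {
            "type": "map",
            "values": {
                "type": "record",
                "name": "agentMetrics",
                "fields": [
                    {"name": "connectedEngagements", "type": "long"},
                    {"name": "nonInteractiveTotalHandlingTime", "type": "long"},
                ],
            },
        },
    }],
}

print(conforms(sample, schema))  # prints True
```

With the original inferred schema, "6453" and "6454" would have to be record field names, which the numeric form disallows; as map keys they carry no such restriction.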

Thanks,

James



On Tue, Mar 28, 2017 at 7:24 AM, Steve Champagne <champagst@gmail.com>
wrote:

> I'm in the process of creating an ingest workflow that will pull a number
> of API endpoints into Kafka topics on an hourly basis. I'd like to convert
> them from JSON to Avro when I bring them in. I have, however, run into a
> few problems that I haven't been able to figure out and haven't turned
> anything up through searches. This seems like a fairly common NiFi use
> case, so I figured I'd ask around to see what others are doing in these
> cases.
>
> The first problem that I'm running into is that some of the endpoints have
> objects of the form:
>
> {
>   "metricsPerAgent": {
>     "6453": {
>       "connectedEngagements": 3,
>       "nonInteractiveTotalHandlingTime": 0
>     },
>     "6454": {
>       "connectedEngagements": 1,
>       "nonInteractiveTotalHandlingTime": 0
>     }
>   }
> }
>
> I'm using an UpdateAttribute processor to add a schema that I get from
> running the object through the InferAvroSchema processor, and then sending
> the flowfile to a ConvertJSONToAvro processor. Unfortunately, the
> ConvertJSONToAvro processor throws an error because it doesn't accept
> field names that are numbers. What do people normally do in cases like
> these?
>
> Thanks!
>
