nifi-users mailing list archives

From Manish Gupta 8 <mgupt...@sapient.com>
Subject RE: Processor to enrich attribute from external service
Date Fri, 02 Sep 2016 22:47:46 GMT
I think the lookup processor should return data in a format that the NiFi Expression Language
can parse efficiently, for example JSON. That would avoid the need for an additional “Extract”-type
processor: all downstream processors could simply use “jsonPath” for further lookups inside
the attribute.

Regards,
Manish

From: Matt Burgess [mailto:mattyb149@gmail.com]
Sent: Friday, September 02, 2016 6:37 PM
To: users@nifi.apache.org
Subject: Re: Processor to enrich attribute from external service

Manish,

Some of the queries in those processors could bring back lots of data, and putting them into
an attribute could cause memory issues. Another concern is when the result is binary data,
such as ExecuteSQL returning an Avro file. And since the return of these is a collection of
records, these processors are often followed by a Split processor to perform operations on
individual records.

Having said that, if the return value is text and you'd like to transfer it to an attribute,
you can use ExtractText to put the content into an attribute. For small content (which is
the appropriate use case), this should be pretty fast, and keeps the logic in a single processor
instead of duplicated (either logically or physically) across processors.

By the way I'm very interested in an RDBMS lookup processor, but not sure I'd have time in
the short run to write it up. If someone takes a crack at it, I recommend properties to pre-cache
the table with a refresh interval. This way if the lookup table doesn't change much and is
not too big, it could be read into the processor's memory for super-fast lookups. Alternatively,
a property could be a cache size, which would build a subset of the table in memory as values
are looked up. This is probably more robust as it is bounded and if the size is set high enough
for a small table, it would be read in its entirety. Still would want the cache refresh property
though.
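
The bounded cache with a refresh interval described above could be sketched roughly as follows. This is a minimal illustration in plain Java, not NiFi code: the class name `BoundedLookupCache`, the `loader` function (standing in for a single-row database query), and the injectable `clock` are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical bounded lookup cache with a refresh interval; not part of any NiFi API. */
class BoundedLookupCache<K, V> {

    /** A cached value together with the time it was loaded. */
    private static final class Cached<T> {
        final T value;
        final long loadedAtMillis;

        Cached(T value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final int maxEntries;
    private final long refreshMillis;
    private final Function<K, V> loader;   // assumed: e.g. a query against the lookup table
    private final LongSupplier clock;      // assumed: injectable so the refresh can be tested
    private final LinkedHashMap<K, Cached<V>> entries;

    BoundedLookupCache(int maxEntries, long refreshMillis,
                       Function<K, V> loader, LongSupplier clock) {
        this.maxEntries = maxEntries;
        this.refreshMillis = refreshMillis;
        this.loader = loader;
        this.clock = clock;
        // Access-ordered map: once the bound is exceeded, the least recently used entry is evicted.
        this.entries = new LinkedHashMap<K, Cached<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Cached<V>> eldest) {
                return size() > BoundedLookupCache.this.maxEntries;
            }
        };
    }

    /** Returns the cached value, reloading it if it is missing or older than the refresh interval. */
    synchronized V get(K key) {
        long now = clock.getAsLong();
        Cached<V> cached = entries.get(key);
        if (cached != null && now - cached.loadedAtMillis < refreshMillis) {
            return cached.value;
        }
        V fresh = loader.apply(key);
        entries.put(key, new Cached<>(fresh, now));
        return fresh;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With `maxEntries` set at or above the number of rows in a small table, every looked-up row stays in memory until its refresh interval expires; with a smaller bound, only the most recently used subset is kept.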

Cheers,
Matt

On Sep 2, 2016, at 6:19 PM, Manish Gupta 8 <mgupta50@sapient.com> wrote:
Thanks for the reply, Joe. Just a thought: do you think it would be a good idea for every
Get processor (GetMongo, GetHBase, etc.) to have two additional properties, such as:

1. Result in Content or Result in Attribute

2. Result Attribute Name (only applicable when “Result in Attribute” is selected)

But then all such processors would have to accept an incoming flowfile (which they don’t
as of now, being “Get” processors).

Maybe ExecuteSQL and FetchDistributedMapCache could be enhanced that way, i.e. have an option
to specify the destination: content or attribute?

Regards,
Manish

From: Joe Witt [mailto:joe.witt@gmail.com]
Sent: Friday, September 02, 2016 5:58 PM
To: users@nifi.apache.org
Subject: Re: Processor to enrich attribute from external service


You would need to make a custom processor for now.  I think we should have a nice controller
service to generalize JDBC lookups which supports caching.  And then a processor which leverages
it.
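
The split suggested here, a reusable lookup service plus a processor that uses it, might look something like the sketch below. It is plain Java rather than the NiFi API: `KeyValueLookup` stands in for the controller service, `AttributeEnricher` for the processor, and a plain `Map` for the flow file attributes. All of these names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup abstraction; in NiFi this role would be played by a controller service. */
interface KeyValueLookup {
    Optional<String> lookup(String key);
}

/** Hypothetical stand-in for the processor: reads one attribute and adds another. */
final class AttributeEnricher {
    private final KeyValueLookup lookup;
    private final String keyAttribute;     // attribute whose value is used as the lookup key
    private final String resultAttribute;  // attribute that receives the lookup result

    AttributeEnricher(KeyValueLookup lookup, String keyAttribute, String resultAttribute) {
        this.lookup = lookup;
        this.keyAttribute = keyAttribute;
        this.resultAttribute = resultAttribute;
    }

    /** Returns a copy of the attributes, with the result attribute added when the lookup succeeds. */
    Map<String, String> enrich(Map<String, String> attributes) {
        Map<String, String> enriched = new HashMap<>(attributes);
        String key = attributes.get(keyAttribute);
        if (key != null) {
            lookup.lookup(key).ifPresent(value -> enriched.put(resultAttribute, value));
        }
        return enriched;
    }
}
```

Because the enricher only depends on the `KeyValueLookup` interface, a JDBC-backed or cached implementation could be swapped in without changing the enrichment logic.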

This comes up fairly often and is pretty straightforward from a design POV.  Anyone want to
take a stab at this?

On Sep 2, 2016 4:47 PM, "Manish Gupta 8" <mgupta50@sapient.com> wrote:
Hello Everyone,

Is there a processor that we can use for updating/adding an attribute of an incoming flow
file from some external service (say MongoDB, Couchbase, or any RDBMS)? The processor would
use an attribute of the incoming flow file to query the external service, and simply modify/add
an attribute on the flow file (without touching the flow file content).

If we have to achieve this kind of “lookup” operation (updating only an attribute, not
the content), what are the options in NiFi?
Should we create a custom processor (maybe by taking the GetMongo processor and modifying its
code to update an attribute with the query result)?

Thanks,
Manish
