nifi-dev mailing list archives

From Kevin Doran <>
Subject Re: Correlate Processor ID in Logs
Date Tue, 22 Aug 2017 20:43:16 GMT
Hi Karthik,

A processor's metadata, including its name and parent process group ID, is accessible through
the NiFi REST API [1] via GET /processors/{id}, which returns:

"component": {
	    "id": "value",
	    "parentGroupId": "value",
	    "name": "value",
	    "type": "value",
	   ... }

Of course, hitting the API for every log line doesn't scale, so one approach would be to build
a local cache of processorId -> processorMetadata in whatever log line processing tool
you are using, and use the cache in order to enrich each log line with the fields you require.
You could build the cache lazily, i.e., start with an empty lookup table, and if the processor
ID is not in the cache, hit the REST API to look it up.
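
As a rough illustration of the lazy cache described above, here is a minimal Python sketch. It is not a definitive implementation. The base URL, the regular expression for the "Name[id=...]" pattern, the helper names (fetch_processor, ProcessorCache, enrich), and the output field names are all assumptions made for this example, and authentication is left out.

```python
import json
import re
import urllib.request

# Assumed base URL; adjust for your NiFi instance (authentication omitted).
NIFI_API = "http://localhost:8080/nifi-api"

# Assumed pattern for the "Name[id=<uuid>]" fragment seen in nifi-app.log lines.
PROCESSOR_ID = re.compile(
    r"\w+\[id=([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\]"
)


def fetch_processor(processor_id):
    """Call GET /processors/{id} and return the "component" object."""
    url = f"{NIFI_API}/processors/{processor_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["component"]


class ProcessorCache:
    """Lazy processorId -> processorMetadata lookup table."""

    def __init__(self, fetch=fetch_processor):
        self._fetch = fetch  # injectable so the cache can be tested offline
        self._cache = {}

    def get(self, processor_id):
        # Hit the REST API only on a cache miss.
        if processor_id not in self._cache:
            self._cache[processor_id] = self._fetch(processor_id)
        return self._cache[processor_id]


def enrich(line, cache):
    """Attach processor metadata to a log line, if it contains a processor ID."""
    match = PROCESSOR_ID.search(line)
    if match is None:
        return {"message": line}
    meta = cache.get(match.group(1))
    return {
        "message": line,
        "processorId": match.group(1),
        "processorName": meta.get("name"),
        "parentGroupId": meta.get("parentGroupId"),
    }
```

Each distinct processor ID costs one REST call; later log lines for the same processor are served from memory. Note that a cached entry can go stale if a processor is renamed or moved, so you may want to expire entries periodically.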



On 8/22/17, 15:56, "Karthik Kothareddy (karthikk) [CONT - Type 2]" <>

    Hello All,
    I am trying to build a monitoring mechanism for our flows, and I'm considering using
"nifi-app.log" as the primary source and filtering its entries based on the messages. However, I see that
a particular message only has the Processor name and ID, for example:
    ERROR [Timer-Driven Process Thread-36] o.a.nifi.processors.standard.ExecuteSQL ExecuteSQL[id=015a1007-548f-1bf5-1836-e4e53164d184]
Unable to execute SQL select query SELECT * FROM table WHERE comp_datetime <= '2017-01-31
23:59:59.813' ORDER BY datetime OFFSET 324000000 ROWS FETCH NEXT 1000000 ROWS ONLY for StandardFlowFileRecord[uuid=fc425c66-b83d-46d2-94bc-332e43345960,claim=StandardContentClaim
[resourceClaim=StandardResourceClaim[id=1499803802779-112000, container=default, section=384],
offset=265042, length=114613],offset=53992,name=16290968101533439,size=167]
    Given the above error message, it is really hard to correlate the ProcessorName/ID to the
actual name of the Processor or its parent ProcessorGroup. Is there a way that I can correlate
them easily?
    Also, I have considered using Bulletins as the source, which are more fine-grained with respect to the
actual processor and the ProcessorGroup it belongs to, but the problem with this approach is that the REST
call only returns 5 bulletins each time. According to this post,
 it is a fixed value, so it is practically not feasible to capture all of them if the flow has multiple
failures every second.
    Any thoughts around this are much appreciated.
