nifi-users mailing list archives

From Bryan Bende <>
Subject Re: Atlas and NiFi integration help
Date Thu, 01 Mar 2018 15:20:35 GMT

As far as I know, Atlas is not really about "event level" lineage; it
is more about "system level" or "data set level" lineage.

So I believe the goal of Atlas is to show how the systems are
connected and how a particular data set flows through the system.

So an example might be... NiFi pulls from source #1, then publishes to
Kafka topic #1, and then a stream processing system consumes from
Kafka topic #1, and then writes results to Hive.

Atlas can then tell you that source #1 flowed through all these
systems and was the source for these results in Hive (something like

I don't think it's a massive long-term store for event-level provenance
data like NiFi has, but others can chime in here if I am wrong.
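
To make the "data set level" idea concrete, here is a rough sketch of the
example above as a small directed graph, with a traversal that answers
"what was derived from source #1?". This is plain Python and does not use
any Atlas or NiFi API; the node names and the LINEAGE/downstream names are
made up for illustration only.

```python
from collections import deque

# Hypothetical data-set-level lineage edges mirroring the example above.
# Each key is a system or data set; each value lists what it feeds into.
LINEAGE = {
    "source #1": ["NiFi flow"],
    "NiFi flow": ["Kafka topic #1"],
    "Kafka topic #1": ["stream processor"],
    "stream processor": ["Hive results"],
}


def downstream(start, edges=LINEAGE):
    """Return everything reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    print(downstream("source #1"))
```

Note that nothing in this graph records individual flowfiles or their
attributes; it only says which systems and data sets feed which others,
which is the level of detail described above.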


On Thu, Mar 1, 2018 at 10:11 AM, Mike Thomsen <> wrote:
> So I tried again, and finally got something populated (screenshot attached
> for reference). What I don't see is anything like the provenance data that
> the processors store. Like nothing about the flowfiles, their attributes,
> etc.
> My goal here is to have a long term, searchable repository of provenance
> data so questions like "when was data set XYZ reindexed" can be answered. Is
> the flowfile provenance data not being captured and sent to Atlas or am I
> doing it wrong?
> If the answer is "not yet" I'm cool with that and would be happy to take a
> stab at expanding the scope of the reporting task's capabilities. I just
> need someone more knowledgeable on this integration to give me pointers.
> Thanks,
> Mike
> On Wed, Feb 28, 2018 at 2:43 PM, Mike Thomsen <>
> wrote:
>> Matt,
>> Yeah, I saw that pretty early on. Admittedly my question may be a bit
>> nebulous. What I'm trying to figure out is what I should be seeing in Atlas
>> if NiFi is sending it events properly. Since the integration and knowledge
>> around it is probably clustered here, I'm not sure I can go to the Atlas
>> list and ask them the same question.
>> Thanks,
>> Mike
>> On Wed, Feb 28, 2018 at 2:13 PM, Matt Burgess <>
>> wrote:
>>> Mike,
>>> There is a nifi-atlas-bundle in NiFi with a NAR that includes the
>>> ReportLineageToAtlas reporting task, but IIRC it is so large that it
>>> is not included in the default assembly. Instead there is a
>>> "include-atlas" profile that can be activated when building the
>>> assembly, and that should include the Atlas NAR and associated
>>> reporting task.
>>> Regards,
>>> Matt
>>> On Wed, Feb 28, 2018 at 1:42 PM, Mike Thomsen <>
>>> wrote:
>>> > I have Atlas 0.8.2 (BerkeleyDB and Embedded ES) and NiFi 1.6.0 nightly
>>> > both
>>> > up and claiming that they can talk to one another.
>>> >
>>> > What should I be seeing if they are? My test configuration consists of
>>> > a
>>> > simple process group that has GetMongo, UpdateAttributes and
>>> > PutElasticSearchHttpRecord. I'm not sure if events are actually making
>>> > it.
>>> >
>>> > The Atlas documentation is pretty limited on setting up a vanilla
>>> > installation, so I was wondering if someone could point me in the right
>>> > direction from a NiFi point of view on what I should be seeing so I can
>>> > start fumbling around in the right direction.
>>> >
>>> > Thanks,
>>> >
>>> > Mike
