nifi-dev mailing list archives

From Sivaprasanna <>
Subject Re: Implementation of ListFile's Primary Node only in a cluster
Date Sat, 24 Feb 2018 15:17:45 GMT
I think it was a cache issue; it now works as intended. It looks like
removing the executionNode === PRIMARY check from nf-processor-details.js
and nf-processor-configuration.js alone is enough. However, I want to
confirm with the community here whether it is okay to remove it.
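
The two visibility checks under discussion can be sketched as follows. This is a plain Java model of the condition, not the actual UI code in nf-processor-details.js or nf-processor-configuration.js; the class, enum, and method names are hypothetical, and the "clustered OR primary" form is taken from the description quoted below.

```java
// Hypothetical model of the 'execution-node-options' visibility check.
// The real logic is JavaScript in the NiFi UI; names here are illustrative only.
final class ExecutionNodeOptionVisibility {

    enum ExecutionNode { ALL, PRIMARY }

    // Check as described in the thread: clustered OR already set to PRIMARY.
    static boolean showWithPrimaryCheck(boolean isClustered, ExecutionNode executionNode) {
        return isClustered || executionNode == ExecutionNode.PRIMARY;
    }

    // Proposed check: the cluster state alone decides.
    static boolean showClusteredOnly(boolean isClustered) {
        return isClustered;
    }

    public static void main(String[] args) {
        // Standalone NiFi with a processor that the annotation defaulted to PRIMARY.
        System.out.println(showWithPrimaryCheck(false, ExecutionNode.PRIMARY));
        System.out.println(showClusteredOnly(false));
    }
}
```

With the first check, a standalone instance still shows the options once the annotation has set the execution node to PRIMARY; with the second, it does not.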

On Fri, Feb 23, 2018 at 10:28 PM, Sivaprasanna wrote:

> I have started working on an annotation implementation wherein the
> developer can use the annotation to indicate that a processor is supposed
> to run only on the 'Primary node'. The framework side of things works just
> fine. However, on the UI side there are a couple of questions and issues:
>    1. nf-processor-details.js and nf-processor-configuration.js check
> whether the setup 'isClustered' or 'executionNode === PRIMARY', which
> confuses me. Checking 'nfClusterSummary.isClustered()' alone is enough,
> right? Because we also check 'executionNode === PRIMARY', the
> 'execution-node-options' will be rendered for processors marked with this
> annotation even on a single-instance NiFi, i.e. a non-clustered setup.
>    2. In order to avoid this, I changed the code and removed the
> 'executionNode === PRIMARY' condition check in the mentioned files. Even
> after that, 'execution-node-options' is still being rendered. Am I missing
> something?
> I have pushed these changes to my remote repo. Here is the link:
> dfc5d4dad3
> BTW, right now I have implemented it this way: if the annotation is
> present, the executionNode is set to 'PRIMARY' at the time of processor
> creation/instantiation. However, this can be changed later by configuring
> the processor from the UI. Should we think about disabling the 'Execution
> Node' configuration altogether (from the UI) for a processor marked with
> this annotation? That makes more sense to me, but it does seem to restrict
> the users' liberty to choose according to their wishes.
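
The behaviour described in the message above (default to PRIMARY when the annotation is present) and the stricter alternative it asks about can be sketched with a runtime annotation and reflection. This is a minimal sketch under stated assumptions: the annotation name follows the @PrimaryNodeOnly suggestion made later in the quoted thread, the processor classes are stand-ins, ALL as the non-annotated default is an assumption, and none of this is NiFi's actual framework code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class PrimaryNodeOnlySketch {

    enum ExecutionNode { ALL, PRIMARY }

    // Hypothetical marker annotation, kept at runtime so it can be read reflectively.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface PrimaryNodeOnly { }

    // Stand-in processor classes, not real NiFi processors.
    @PrimaryNodeOnly
    static class RemoteListingProcessor { }

    static class LocalProcessor { }

    // Default chosen at creation time: PRIMARY if annotated, otherwise ALL (assumed default).
    static ExecutionNode defaultExecutionNode(Class<?> processorClass) {
        return processorClass.isAnnotationPresent(PrimaryNodeOnly.class)
                ? ExecutionNode.PRIMARY
                : ExecutionNode.ALL;
    }

    // Stricter variant: reject a later change away from PRIMARY on annotated processors.
    static ExecutionNode resolveRequested(Class<?> processorClass, ExecutionNode requested) {
        boolean restricted = processorClass.isAnnotationPresent(PrimaryNodeOnly.class);
        if (restricted && requested != ExecutionNode.PRIMARY) {
            throw new IllegalArgumentException(
                    processorClass.getSimpleName() + " may only run on the primary node");
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(defaultExecutionNode(RemoteListingProcessor.class));
        System.out.println(defaultExecutionNode(LocalProcessor.class));
    }
}
```

The first method matches the "set at creation, changeable later" approach; the second shows what enforcing the restriction on every configuration change might look like.
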
> On Sun, Feb 11, 2018 at 12:59 AM, Bryan Bende <> wrote:
>> Currently it means that the dataflow manager/developer is expected to
>> set the 'Execution Nodes' strategy to "Primary Node" at the time of
>> flow design.
>> We don't have anything that restricts the scheduling strategy of a
>> processor, but we probably should consider having an annotation like
>> @PrimaryNodeOnly that you can put on a processor and then the
>> framework will enforce that it can only be scheduled on primary node.
>> In the case of ListFile, I think the statement in the documentation is
>> only partially true...
>> When "Input Directory Location" is set to local, there should be no
>> issue with scheduling the processor on all nodes in the cluster, as it
>> would be listing a local directory and storing state locally.
>> When "Input Directory Location" is set to remote, it wouldn't make
>> sense to have all nodes listing the same remote directory and getting
>> the same results, and also the state is then stored in ZooKeeper under
>> a ZNode using the processor's UUID, and the processor has the same
>> UUID on each node so they would be overwriting each other's state in
>> ZK.
>> So ListFile probably can't be restricted to primary node only, whereas
>> something like ListHDFS probably could because it is always listing
>> a remote destination.
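
The state-overwrite problem described in the message above can be illustrated with a toy model: a shared store keyed only by the processor's UUID holds one entry, so the last node to write wins. The store class, the UUID string, and the timestamp values below are all hypothetical; this does not use ZooKeeper or NiFi's state API.

```java
import java.util.HashMap;
import java.util.Map;

final class ClusterStateCollisionSketch {

    // Stand-in for cluster-scoped state: one entry per processor UUID, shared by all nodes.
    static final class SharedStore {
        private final Map<String, String> entries = new HashMap<>();

        void put(String processorId, String value) {
            entries.put(processorId, value);
        }

        String get(String processorId) {
            return entries.get(processorId);
        }

        int size() {
            return entries.size();
        }
    }

    // Two nodes run the same processor (same UUID) and each saves its own listing state.
    static SharedStore simulateTwoNodes(String processorId) {
        SharedStore store = new SharedStore();
        store.put(processorId, "node-1 last listed at 100");
        store.put(processorId, "node-2 last listed at 250"); // replaces node-1's entry
        return store;
    }

    public static void main(String[] args) {
        SharedStore store = simulateTwoNodes("hypothetical-processor-uuid");
        System.out.println(store.size() + " entry: " + store.get("hypothetical-processor-uuid"));
    }
}
```

Only node-2's value survives, which is why scheduling a remote listing on every node is a problem while a purely local listing with locally stored state is not.
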
>> On Fri, Feb 9, 2018 at 10:55 PM, Sivaprasanna wrote:
>> > I was going through ListFile processor's code and found out that in the
>> > documentation
>> > <
>> /nifi-standard-bundle/nifi-standard-processors/src/main/
>> java/org/apache/nifi/processors/standard/>,
>> > it is mentioned that "this processor is designed to run on Primary Node
>> > only in a cluster". I want to understand what "designed" stands for
>> > here. Does that mean the processor was built so that it only runs on the
>> > Primary node even when the "Execution Nodes" strategy is set otherwise,
>> > or does it mean that the dataflow manager/developer is expected to set
>> > the 'Execution Nodes' strategy to "Primary Node" at the time of flow
>> > design? If it is the former, how is it handled in the code? If it is
>> > handled, it should be on the framework side, but I don't see any
>> > annotation indicating such a mechanism in the processor code. Moreover,
>> > a related JIRA, NIFI-543, is also open, so I want to clear my doubt.
>> >
>> > -
>> > Sivaprasanna
