nifi-dev mailing list archives

From "Woschitz, Janosch" <Janosch.Wosch...@thinkbiganalytics.com>
Subject Replication of flow file and content repositories (HA setup)
Date Wed, 19 Jul 2017 14:40:55 GMT
Hello everyone,

In general, NiFi seems to support HA semantics by establishing multi-master/no-master clustering
via ZooKeeper. This works well for reaching consensus on the currently deployed flow. However,
it is not clear to me how I can protect against local disk failure other than by relying on a
RAID10 setup.

The flow file and content repositories are stored on local disks. If I have a rather complex
flow, a full outage of one node in my cluster could result in the loss of the data that
is “flowing” through that node at the time.

I found the following feature proposals in the wiki which seem to address this problem:
https://cwiki.apache.org/confluence/display/NIFI/Data+Replication
https://cwiki.apache.org/confluence/display/NIFI/High+Availability+Processing

Unfortunately, I was not able to find any pointers to the current state of these proposals.
Is there any active work happening in this direction?

How could one support the project in achieving the goals described in these feature proposals?
I would think the work needs to be broken down into smaller work packages beforehand to allow
smooth integration into master/upstream.

Thanks,
Janosch