nifi-dev mailing list archives

From Chris Sampson <>
Subject Re: Docker Image Improvements for Kubernetes
Date Thu, 04 Jun 2020 11:53:25 GMT
I've been using NiFi's Docker image for a while now and thought a few notes
from the things we've done might be useful for your work:

   - Using Docker Swarm (NiFi 1.9.2)
      - Had to add some property file updates as part of a custom
      Dockerfile build because the didn't cover them (some of these
      might have already been addressed):
         - needs to be set to true for
         secure clusters
         - allow for multiple NODE_IDENTITY entries to be specified in
         authorizers.xml via environment variables (e.g. NODE_IDENTITY_1,
         NODE_IDENTITY_2, etc.) - add as "Node Identity" and "Initial
         User Identity"
         - allow configuration of ldap in authorizers.xml
            - uncommenting sections of the file
            - replacing element values/attributes with environment variables
            - add User Group Providers (we had a composite of LDAP and File)
         - update to set ``
         related properties for LDAP <-> PKI mappings
         - update to set appropriate ``/``
         related entries that were found to be required to enable clustering,
         site-to-site and external connections in our Swarm setup (hosted
         across multiple AWS EC2s with two Swarm "networks" in play)
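
For illustration only, the NODE_IDENTITY_n handling mentioned above could be
sketched roughly as below. This is not the script we actually ran; the function
names are made up, and it assumes the stock provider identifiers
(file-user-group-provider, file-access-policy-provider) and that the file has
no numbered "Node Identity"/"Initial User Identity" entries yet.

```python
import re
import xml.etree.ElementTree as ET


def numbered_identities(env, prefix="NODE_IDENTITY_"):
    """Return the values of PREFIX<n> variables, ordered by n."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    found = []
    for name, value in env.items():
        match = pattern.match(name)
        if match and value:
            found.append((int(match.group(1)), value))
    return [value for _, value in sorted(found)]


def add_identity_properties(provider, label, identities):
    """Append <property name="<label> <n>"> children to a provider element."""
    for index, identity in enumerate(identities, start=1):
        prop = ET.SubElement(provider, "property", {"name": f"{label} {index}"})
        prop.text = identity


def apply_node_identities(authorizers_xml, env):
    """Return authorizers XML text with node identities added to both providers."""
    root = ET.fromstring(authorizers_xml)
    identities = numbered_identities(env)
    # Assumed identifiers: the defaults shipped in authorizers.xml.
    for provider in root.iter("userGroupProvider"):
        if provider.findtext("identifier") == "file-user-group-provider":
            add_identity_properties(provider, "Initial User Identity", identities)
    for provider in root.iter("accessPolicyProvider"):
        if provider.findtext("identifier") == "file-access-policy-provider":
            add_identity_properties(provider, "Node Identity", identities)
    return ET.tostring(root, encoding="unicode")
```

At container startup this would be called with os.environ and the contents of
authorizers.xml, and the result written back before NiFi starts.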

Having been through some of the pain above, we later moved to a Kubernetes
stack and re-implemented some of our approach. One decision we made was to
inject properties/configuration files instead of using the environment
variable replacements via (because so many things we wanted
weren't covered and we didn't want to continue trying to update the
provided via sed/awk commands in our Dockerfile to add more
commands as part of the container startup routine).

   - Using Kubernetes (NiFi 1.11.4)
      - custom Dockerfile that overrides the scripts to provide:
         - overwrite of "static" config files injected into the k8s
         StatefulSet (i.e. everything under conf/ that isn't generated at
         startup)
            - we set non-dynamic & non-secure values in these files within
            our git repo then inject them into the pod
         - set dynamic properties, e.g. hostnames (for
         ``), similar to the provided script but a
         different set of properties, as what we need is different to what
         it provides
         - create nifi-toolkit properties files (e.g. setting `baseUrl` and
         `proxiedEntity`, etc. based on hostname & env vars)
         - set secure properties (e.g. encryption.keys) that have been
         provided as files/env vars by k8s/STS
         - add "Node Identity"/"Initial User Identity" entries based on the
         k8s/STS setup (i.e. number of nodes in the cluster)
         - setup "Initial Admin Identity" (based on env var)
         - request node & initial admin certificates from a nifi-toolkit
         instance (running in server mode) then configure them in &
         nifi-toolkit properties
         - create "common" keystore & truststore files in a known location
         with a common password on each cluster node - this is required so
         we can configure S2S reporting tasks with an SSL Controller Service
         (that can only take a single file and password combination so has
         to be common across all
         - use nifi-toolkit to encrypt conf files (after they've been
         - delete unwanted NARs from lib/
         - download required extra (apache-nifi) NARs
      - we have persisted volumes for
         - some logs (that we don't output to STDOUT)
         - persisted configuration, e.g. flow.xml.gz, users.xml,
         - each of the repositories
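
As a rough sketch of the "set dynamic properties" step above (not our actual
startup script), a properties file can be patched like this. The helper names
are hypothetical, and the three host-related keys are only an illustrative
selection of real nifi.properties keys, not necessarily the set we changed.

```python
def set_properties(text, updates):
    """Return properties-file text with each key in `updates` set to its value.

    Existing `key=value` lines are rewritten in place; keys that are not
    present are appended at the end. Comments and other lines are untouched.
    """
    remaining = dict(updates)
    lines = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_property = "=" in line and not line.lstrip().startswith("#")
        if is_property and key in remaining:
            lines.append(f"{key}={remaining.pop(key)}")
        else:
            lines.append(line)
    lines.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(lines) + "\n"


def host_properties(hostname):
    """Host-related keys to set (an assumed, illustrative selection)."""
    return {
        "nifi.web.https.host": hostname,
        "nifi.cluster.node.address": hostname,
        "nifi.remote.input.host": hostname,
    }
```

In a StatefulSet the hostname would typically come from the pod itself (for
example via socket.getfqdn()), so each node ends up with its own values while
the "static" files in git stay identical.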

Retrospectively (things always look wrong when you look back, right? 😊),
some of the stuff we've done with our custom startup scripts would have
probably been better as init-containers (e.g. requesting certificates,
dynamic config changes), but things that might be worth considering from a
NiFi Docker point of view:

   - cut-down image in terms of NARs with a way to inject/download extra
   NARs as required at startup/as part of a custom build; but that said, the
   current base is probably fine and anyone wanting to delete NARs should do
   so with their own custom build, as we have
   - providing a "base" set of config files but allowing for overrides
   using files in a known directory; here I'm thinking mainly of things like
   bootstrap.conf, where you could have a conf/conf.d/01-bootstrap.conf file
   to provide extra JVM args, similar to Elasticsearch jvm.options.d
   - as you already mentioned, more property/config settings via
   environment variables
   - ability to change logging config (again could this be done with
   additional files in a separate directory maybe?)
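
To make the conf.d idea above a bit more concrete, here is a minimal sketch of
how override files could be layered onto a base file. NiFi does not do this
today as far as I know; merge_conf_d and parse_properties are hypothetical
names, and the "later file wins, in file-name order" rule is just an assumption
borrowed from how other conf.d-style directories usually behave.

```python
from pathlib import Path


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, value = stripped.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def merge_conf_d(base_file, override_dir):
    """Overlay *.conf files from override_dir onto base_file, in name order."""
    merged = parse_properties(Path(base_file).read_text())
    for override in sorted(Path(override_dir).glob("*.conf")):
        # Later files replace keys set by the base file or earlier overrides.
        merged.update(parse_properties(override.read_text()))
    return merged
```

With this, a conf/conf.d/01-bootstrap.conf containing a single java.arg.N line
would change or add just that JVM argument and leave the rest of the base
bootstrap.conf alone.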

*Chris Sampson*
IT Consultant

On Wed, 3 Jun 2020 at 13:57, Shawn Weeks <> wrote:

> I’m working on deploying NiFi to Kubernetes and I’ve run across several
> things that could be improved.
>   1.  Currently flow.xml.gz is stored in ./conf by default which has been
> designated a Docker volume. In Kubernetes volumes are not pre-populated
> from the image so I’m left with some init container magic to copy the
> contents of ./conf to another volume and then back again otherwise ./conf
> is empty. Since we’re configuring everything via environment variables
> anyway setting nifi.flow.configuration.file and designating a volume just for
> flow.xml.gz would solve that. You could even reuse your existing conf
> volume if you haven’t changed anything.
>   2.  Expose more variables - NIFI-6232 already exists for this but hasn’t
> had any work.
>   3.  Support OpenID Login Provider
>   4.  Expose logs besides nifi-app.log
