nifi-dev mailing list archives

From Adam Taft <>
Subject Re: How to ingest files into HDFS via Apache NiFi from non-hadoop environment
Date Tue, 27 Jun 2017 20:24:19 GMT
This is a bit outside of the box, but I have actually implemented this
solution previously.

My scenario was very similar.  NiFi was installed outside of the firewalled
HDFS cluster.  The only external access to the HDFS cluster was through SSH.

Therefore, my solution was to use SSH to call a remote command on the HDFS
node.  This was enabled using the ExecuteStreamCommand processor.  I used
the hadoop fs command line tools, piping in the contents of the flowfile.

The basic command (assuming put) would look something like this:

$>  cat file.ext | hadoop fs -put - /hdfs/path/file.ext

The '-' source makes put read from standard input and store the stream in
/hdfs/path/file.ext.  Next, wrap the command in an SSH call so that it runs
on the remote node:

$>  cat file.ext | ssh user@remote 'hadoop fs -put - /hdfs/path/file.ext'

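Now we can try to put the above into the ExecuteStreamCommand processor.
The processor pipes the flowfile content to the command's standard input,
so the cat is no longer needed.  We will take the filename from the
flowfile's filename attribute.  I like using bash to execute my script: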

Command Path:  /bin/bash
Command Arguments: -c; "ssh user@remote 'hadoop fs -put - /hdfs/path/${filename}'"
(* unsure of the quotes here)
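
One way to sidestep the quoting question is to move the SSH call into a small
wrapper script and let the processor pass the filename as a plain argument.
The following is only a sketch under assumptions: the function names, the
SSH_BIN override, and the script location are all hypothetical, and it has
not been tried against a real cluster.  It relies on the fact that ssh hands
the command string to the remote shell, so the HDFS path is single-quoted for
that shell.

```shell
#!/bin/sh
# Hypothetical wrapper: hdfs-put.sh <user@host> <hdfs-dir> <filename>
# The data to store is read from standard input.

# Wrap a string in single quotes so the remote shell sees it as one word.
# Embedded single quotes are rewritten as '\'' (close, escaped quote, reopen).
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Build the command line that will run on the HDFS node.
# $1 = HDFS directory, $2 = file name
remote_put_command() {
    printf 'hadoop fs -put - %s' "$(shell_quote "$1/$2")"
}

# Stream stdin to HDFS over SSH.
# $1 = user@host, $2 = HDFS directory, $3 = file name
# SSH_BIN is an assumed override (e.g. SSH_BIN=echo) for dry runs.
hdfs_put_via_ssh() {
    "${SSH_BIN:-ssh}" "$1" "$(remote_put_command "$2" "$3")"
}

# Run only when all three arguments are given, so the file can also be sourced.
if [ "$#" -eq 3 ]; then
    hdfs_put_via_ssh "$1" "$2" "$3"
fi
```

With a wrapper like this, the processor properties might become the
following (assuming the default ';' argument delimiter; the script path is
made up for illustration):

Command Path:  /opt/nifi/scripts/hdfs-put.sh
Command Arguments: user@remote;/hdfs/path;${filename}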

Not sure if the above helps, since it sounds like you're going for
something more than 'get' and 'put'.  But the above is an easy mechanism to
interact with an HDFS cluster if the NiFi node is not running on the

On Fri, Jun 23, 2017 at 2:53 PM, Mothi86 <> wrote:

> Okay thanks so that clarifies that NiFi will not work in terms of integrating
> from local machine / non-hadoop environment to hadoop environment. It either
> has to be in edge node or built up a node similar restriction of edge or
> management node.
> Is this HDF recommended solution ?
> Will spinning a VM work ? Can you suggest me VM requirements for Apache NiFi ?
