nifi-dev mailing list archives

From Andy LoPresto <alopresto.apa...@gmail.com>
Subject Re: Nifi Capability for Fast transfer of Data
Date Sat, 26 Nov 2016 01:37:55 GMT
Unless my back-of-the-envelope math is way off, transferring 50 GB (400 Gb) in one second
would require 40 parallel 10 GbE connections, assuming absolutely no overhead. Over a single
10 GbE link at optimal transmission speed, your tolerance for "a few seconds" would need to
be 40+ seconds.
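
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function
names are purely illustrative, and it assumes decimal units (1 GB = 8 Gb) and zero protocol
overhead, so real transfers will take longer than these figures.

```python
import math


def transfer_seconds(size_gigabytes: float, link_gbps: float, links: int = 1) -> float:
    """Ideal transfer time in seconds over `links` parallel links, ignoring all overhead."""
    size_gigabits = size_gigabytes * 8  # 1 byte = 8 bits
    return size_gigabits / (link_gbps * links)


def links_needed(size_gigabytes: float, link_gbps: float, deadline_seconds: float) -> int:
    """Minimum number of parallel links needed to finish within the deadline, ignoring overhead."""
    size_gigabits = size_gigabytes * 8
    return math.ceil(size_gigabits / (link_gbps * deadline_seconds))


if __name__ == "__main__":
    # 50 GB over a single 10 GbE link
    print(transfer_seconds(50, 10))
    # Links required to move 50 GB in one second
    print(links_needed(50, 10, 1))
```

Plugging in the lower end of the stated range (40 GB) still gives a best case of more than
30 seconds on one 10 GbE link.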

From the Apache NiFi Overview document: 

"for something concrete and broadly applicable, consider the out-of-the-box default implementations.
These are all persistent with guaranteed delivery and do so using local disk. So being conservative,
assume roughly 50MB per second read/write rate on modest disks or RAID volumes within a typical
server. NiFi for a large class of dataflows then should be able to efficiently reach 100MB
per second or more of throughput. "

Those numbers are at least 18 months old, so with a robust cluster of 8 high-performance machines
and an optimized flow to balance computation across all the boxes, I would ballpark a perfect-world
estimate at 1 Gbps. The last HDFS write speeds I knew of were around 10-20 Gbps. Again,
if your tolerance for the full process is 40-50 seconds, NiFi should be able to keep up, but
your uplink will probably be the long pole in the tent here.

Feel free to correct any poor assumptions or bad math above. 

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Nov 24, 2016, at 20:48, shweta Aggarwal <shweta.agg1982@gmail.com> wrote:
> 
> Hi folks,
> 
> We have a requirement in one of our time-critical applications: we need
> to transfer up to 40-50 GB of images within a few seconds between a
> remote machine and HDFS.
> 
> Assuming the network connection between the two is a 10 GbE link, with
> NIC and socket buffers tuned for best performance, can NiFi deliver the
> desired performance using a combination of "GetFile" and "PutHDFS" on a
> high-end cluster of more than 8 nodes?
> 
> We are also exploring a combination of HDFS and GridFTP for fast transfer
> of images from the remote machine to the HDFS cluster.
> 
> Any thoughts or pointers would be helpful.
> 
> Thanks!!
