nifi-dev mailing list archives

From Lee Laim <lee.l...@gmail.com>
Subject Re: Nifi Capability for Fast transfer of Data
Date Sat, 26 Nov 2016 02:21:13 GMT
Shweta,

While this may deviate from your initial requirements, NiFi offers the ability to compress,
resize, and extract metadata from your images. You can use NiFi to build an image-processing
pipeline for incoming images that prioritizes and routes the ~10% of image data that needs to arrive
in 4 seconds. The rest of the images will show up shortly after. Resizing and compression,
where applicable, can also help move you towards your goal.

Have fun, 
Lee

On Nov 25, 2016, at 6:37 PM, Andy LoPresto <alopresto.apache@gmail.com> wrote:

> Unless my back of the envelope math is way off, to transfer 50GB (400Gb) per second,
> you would need 40 parallel 10GbE connections, assuming absolutely no overhead. Your precision
> for "a few seconds" would need to be 40+ seconds using a single 10 GbE link and optimal transmission
> speed.
> 
> From the Apache NiFi Overview document: 
> 
> "for something concrete and broadly applicable, consider the out-of-the-box default implementations.
> These are all persistent with guaranteed delivery and do so using local disk. So being conservative,
> assume roughly 50MB per second read/write rate on modest disks or RAID volumes within a typical
> server. NiFi for a large class of dataflows then should be able to efficiently reach 100MB
> per second or more of throughput."
> 
> Those numbers are at least 18 months old, so with a robust cluster of 8 high-performance
> machines and an optimized flow to balance computation across all the boxes, I would ballpark
> a perfect world estimate at 1Gbps. My last knowledge of HDFS write speeds was around 10-20Gbps.
> Again, if your tolerance for the full process is 40-50 seconds, NiFi should be able to keep
> up, but your uplink will probably be the long pole in the tent here.
> 
> Feel free to correct any poor assumptions or bad math above. 
> 
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> 
>> On Nov 24, 2016, at 20:48, shweta Aggarwal <shweta.agg1982@gmail.com> wrote:
>> 
>> Hi folks,
>> 
>> We have a requirement in one of our time-critical applications wherein we
>> are looking to transfer up to 40-50 GB worth of images
>> within a few seconds between a remote machine and HDFS.
>> 
>> Assuming network connectivity between the two is on a 10GbE link, with NIC
>> and socket buffers tuned optimally to give the best performance, does NiFi
>> have the capability to support the desired performance using a combination of
>> "GetFile" and "PutHDFS" on a high-end cluster of >8 nodes?
>> 
>> We are also exploring a combination of HDFS+GridFTP for fast transfer of
>> images from the remote machine to the HDFS cluster.
>> 
>> Any thoughts or pointers shall be helpful.
>> 
>> Thanks!!
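
For reference, the back-of-the-envelope numbers in this thread can be reproduced with a few lines of Python. This is only a sketch under idealized assumptions: decimal units (1 GB = 8 Gb), a fully saturated link, and no protocol, disk, or NiFi overhead. The function names are made up for illustration and are not part of NiFi. The 5 GB figure in the last call assumes the ~10% priority slice is taken from the 50 GB upper bound, which is an assumption and not a number stated in the thread.

```python
import math

BITS_PER_BYTE = 8  # decimal units: 1 GB = 8 Gb


def transfer_seconds(payload_gb: float, link_gbps: float, links: int = 1) -> float:
    """Ideal time to move payload_gb gigabytes over `links` parallel links.

    Hypothetical helper: ignores all overhead (TCP, disk, HDFS replication).
    """
    return payload_gb * BITS_PER_BYTE / (link_gbps * links)


def links_needed(payload_gb: float, link_gbps: float, deadline_s: float) -> int:
    """Smallest number of parallel links that meets the deadline (ideal case)."""
    return math.ceil(payload_gb * BITS_PER_BYTE / (link_gbps * deadline_s))


if __name__ == "__main__":
    # 50 GB over a single 10GbE link
    print(transfer_seconds(50, 10))    # 40.0 seconds
    # 50 GB in one second over 10GbE links
    print(links_needed(50, 10, 1.0))   # 40 links
    # Assumed priority slice: 10% of 50 GB = 5 GB over one 10GbE link
    print(transfer_seconds(5, 10))     # 4.0 seconds
```

Under these assumptions the full 50 GB needs about 40 seconds on one 10GbE link, while a 5 GB priority slice fits in roughly 4 seconds, which is why prioritizing a subset of the images is more realistic than moving everything within a few seconds.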
