nifi-dev mailing list archives

From Andy LoPresto <alopre...@apache.org>
Subject Re: FuzzyHashContent/CompareFuzzyHash processor
Date Fri, 06 Oct 2017 23:00:53 GMT
Hi Shankha,

The fuzzy hash processors operate on the content of the flowfile. You would first use a processor
to ingest the “data file” content. This could be something like GetFile, GetHDFS, GetSFTP,
InvokeHTTP, etc. depending on the source of the file. Once that step is done, the flowfile
content will contain the data file bytes. If you want to perform the fuzzy hash calculation
on the entire data file content, you can connect the success relationship from the ingest
processor directly to FuzzyHashContent, and the resulting flowfile will contain an attribute
with the calculated hash value. If you want to perform the calculation over only specific
parts of the flowfile, first use a processor such as EvaluateJsonPath, EvaluateXPath, or
ReplaceText to reduce the content to those parts, then route the result to FuzzyHashContent.
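
To make the data flow concrete, here is a rough Python sketch of the same idea outside of NiFi. Everything in it is an assumption for illustration: the FlowFile class, the select_fields helper, and the "fuzzy.hash" attribute name are hypothetical, and toy_fuzzy_hash is a simple MinHash signature standing in for the real fuzzy hash algorithms, not what the NiFi processors compute.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in for a NiFi flowfile: content bytes plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def toy_fuzzy_hash(data: bytes, num_hashes: int = 32, shingle: int = 4) -> str:
    """Toy MinHash signature over byte shingles (illustration only)."""
    # Break the content into overlapping byte windows.
    shingles = {data[i:i + shingle] for i in range(max(1, len(data) - shingle + 1))}
    parts = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(2, "big")
        # Keep the smallest salted digest; similar inputs tend to share minima.
        smallest = min(hashlib.sha1(salt + s).digest()[:4] for s in shingles)
        parts.append(smallest.hex())
    return ":".join(parts)


def similarity(sig_a: str, sig_b: str) -> float:
    """Fraction of signature positions that match (estimates shingle overlap)."""
    a, b = sig_a.split(":"), sig_b.split(":")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def select_fields(flowfile: FlowFile, keys: list) -> FlowFile:
    """Replace JSON content with only the chosen field values, before hashing."""
    record = json.loads(flowfile.content)
    reduced = "\n".join(str(record[k]) for k in keys).encode("utf-8")
    return FlowFile(content=reduced, attributes=dict(flowfile.attributes))


def hash_content(flowfile: FlowFile) -> FlowFile:
    """Store the signature as an attribute; the content is left untouched."""
    flowfile.attributes["fuzzy.hash"] = toy_fuzzy_hash(flowfile.content)
    return flowfile


if __name__ == "__main__":
    raw = FlowFile(b'{"id": 1, "name": "Jon Smith", "city": "Lexington"}')
    other = FlowFile(b'{"id": 2, "name": "John Smith", "city": "Lexington"}')
    # Hash only the fields of interest, then compare the two signatures.
    first = hash_content(select_fields(raw, ["name", "city"]))
    second = hash_content(select_fields(other, ["name", "city"]))
    print(similarity(first.attributes["fuzzy.hash"], second.attributes["fuzzy.hash"]))
```

The point is the ordering: reduce the content to the fields you care about, hash what remains, and carry the hash along as an attribute so that a later comparison step can work from the attribute alone.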

You can see an example flow that uses these processors on slide 21 of a presentation [1]
that André Fucs de Miranda and I gave recently. André has published the flow XML here [2].

[1] https://github.com/alopresto/slides/blob/master/dws_sydney_2017/the_power_of_intelligent_flows.pdf
[2] https://github.com/fluenda/dataworks_summit_iot_botnet

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Oct 6, 2017, at 4:27 AM, shankhamajumdar <shankha.majumdar@lexmark.com> wrote:
> 
> Hi,
> 
> I want to implement fuzzy logic on some fields in a data file using NiFi. I
> am trying to use  FuzzyHashContent/CompareFuzzyHash processor but not sure
> how to implement the flow. Can you please provide me an example?
> 
> Regards,
> Shankha
> 
> 
> 
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/

