nifi-dev mailing list archives

From Chris Herssens <chris.herss...@gmail.com>
Subject Re: Data anonymization in Nifi
Date Mon, 23 Oct 2017 05:23:33 GMT
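Hello Vyshali,

Below is a Python code example that hashes the fourth column of a
semicolon-separated CSV file using the ExecuteScript processor.
Note that hashing a field with SHA-256 changes its length: a SHA-256 digest
is 256 bits long, so the hex string that replaces the field is always
64 characters, whatever the length of the original value.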
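
As a quick illustration of the length point, separate from the NiFi script
that follows: the sketch below uses only Python's standard hashlib module.
The helper name sha256_hex and the two sample values are made up for this
example. It should show that the hex digest has the same length for a short
and a long input.

```python
import hashlib

def sha256_hex(value):
    # hashlib works on bytes, so encode the text first.
    # (sha256_hex is a hypothetical helper name used only in this sketch.)
    return hashlib.sha256(value.encode('utf-8')).hexdigest()

# Made-up sample values, not taken from any real data set.
short_digest = sha256_hex(u'a')
long_digest = sha256_hex(u'a much longer field value than the first one')

# 256 bits = 32 bytes = 64 hexadecimal characters.
print(len(short_digest))
print(len(long_digest))
```

So a column that held values of varying length will hold fixed-width
64-character strings after hashing; any downstream schema or column width
would need to allow for that.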

import hashlib
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

def hashField(text):
    # Hex-encoded SHA-256 digest of the field (always 64 characters).
    # The content is read as ISO-8859-1 below, so 'latin-1' can encode every
    # character in it; 'ascii' would fail on accented characters.
    return hashlib.sha256(text.encode('latin-1')).hexdigest()

class convertStream(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)
        output = []
        for line in text.splitlines():
            l = line.split(';')
            # Hash the fourth column (index 3), lower-cased first
            l[3] = hashField(l[3].lower())
            # Append an extra column: <hash>_<column 1>_<column 2>
            l.append(l[3] + "_" + l[0] + "_" + l[1])
            output.append(';'.join(l))
        out = '\n'.join(output)
        outputStream.write(out.encode('latin-1'))

flowfile = session.get()
if flowfile is not None:
    flowfile = session.write(flowfile, convertStream())
    # Rename the flow file to <base name>_hashed
    flowfile = session.putAttribute(flowfile, "filename",
        flowfile.getAttribute('filename').split('.')[0] + '_hashed')
    session.transfer(flowfile, REL_SUCCESS)
    session.commit()
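
To try the per-line logic outside NiFi, the same transformation can be
sketched as plain Python with no Java or NiFi imports. This is only a sketch
under assumptions: the function names anonymize_line and anonymize_text, the
default arguments, and the sample rows are invented for illustration and are
not part of the script above.

```python
import hashlib

def hash_field(text):
    # Same hashing as in the script: hex-encoded SHA-256 of the field.
    return hashlib.sha256(text.encode('latin-1')).hexdigest()

def anonymize_line(line, sep=';', column=3):
    # Hypothetical helper: hash one column and append a composite key.
    fields = line.split(sep)
    fields[column] = hash_field(fields[column].lower())
    # Extra column: <hash>_<first column>_<second column>
    fields.append(fields[column] + '_' + fields[0] + '_' + fields[1])
    return sep.join(fields)

def anonymize_text(text):
    # Apply the transformation to every line of the input.
    return '\n'.join(anonymize_line(line) for line in text.splitlines())

# Invented sample rows: id;name;city;email
sample = (u'1;Smith;Brussels;John.Smith@example.com\n'
          u'2;Jones;Ghent;amy@example.com')
result = anonymize_text(sample)
print(result)
```

Each output row keeps its first three columns, has the fourth replaced by a
64-character hex digest, and gains a fifth column. A row with fewer than four
columns would raise an IndexError in this sketch, as it would in the script
above, so malformed lines may need to be filtered out beforehand.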



Regards,

Chris

On Fri, Oct 20, 2017 at 7:19 PM, Vyshali <vyshali.n@honeywell.com> wrote:

> Hi Chris,
>
> Thanks for the suggestion. Should I write code in Python or some other
> language to hash the data using the ExecuteScript processor? If so, will
> the format of the data be retained after hashing?
> Please provide some clarity on that.
>
> Thanks,
> Vyshali
>
>
>
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/
>
