nifi-users mailing list archives

From "Peter Wicks (pwicks)" <>
Subject Reading Email Message Body
Date Mon, 30 Oct 2017 06:53:16 GMT
A coworker and I were troubleshooting a bug in the ConsumeEWS processor where Unicode characters
were being read as ASCII.

I figured out that the bug was in my code for ConsumeEWS, and I plan to fix it. As part of
the research, though, I found that the way Unicode text in the email is written to the FlowFile
is not easy to work with; in general, the whole email body is hard to work with. If there are
attachments and all you want is the body, it's even more of a mess.

How are other users reading the email message body? Has anyone else run into the issue with
Unicode characters?

In my scenario, we see the smart quotes and apostrophes from Outlook's Word editor becoming '?'
characters, and with my fix in place they are written to the FlowFile using some kind of
serialization format:

“Where there’s NiFi there is Happiness” becomes:

=E2=80=9CWhere there=E2=80=99s NiFi there is Happiness=E2=80=9D.
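
For reference, the output above matches quoted-printable encoding (RFC 2045) applied to the UTF-8 bytes of the curly quotes and apostrophe: E2 80 9C, E2 80 99 and E2 80 9D are the UTF-8 byte sequences for U+201C, U+2019 and U+201D. Below is a minimal sketch of reversing it outside NiFi with Python's standard library. The helper name `decode_quoted_printable` is made up for illustration, and UTF-8 is assumed as the charset; in a real message the charset should come from the part's Content-Type header.

```python
import quopri


def decode_quoted_printable(encoded: bytes, charset: str = "utf-8") -> str:
    """Undo quoted-printable encoding, then decode the resulting bytes as text.

    The charset default is an assumption for this example; a real message
    declares it in the Content-Type header of the body part.
    """
    raw_bytes = quopri.decodestring(encoded)  # "=E2=80=9C" becomes b"\xe2\x80\x9c"
    return raw_bytes.decode(charset)


if __name__ == "__main__":
    encoded = b"=E2=80=9CWhere there=E2=80=99s NiFi there is Happiness=E2=80=9D."
    print(decode_quoted_printable(encoded))
```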

Is there a need for a new Email processor that extracts the message body by deserializing
the FlowFile and reading out the body?
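
To make the idea concrete, here is a rough sketch of what "reading out the body" could look like, written in Python rather than as a NiFi processor. It assumes the FlowFile content is a complete RFC 822/MIME message; the function name `extract_body` and the sample message are invented for the example. The standard `email` package skips attachments, undoes the transfer encoding, and applies the declared charset.

```python
from email import message_from_bytes, policy


def extract_body(raw_email: bytes) -> str:
    """Return the decoded text body of a raw MIME message, or "" if none is found."""
    # policy.default makes the parser return an EmailMessage, which offers get_body().
    msg = message_from_bytes(raw_email, policy=policy.default)
    # Prefer the plain-text part and fall back to HTML; attachments are skipped.
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is None:
        return ""
    # get_content() reverses quoted-printable/base64 and decodes using the part's charset.
    return body.get_content()


# Hypothetical sample: one quoted-printable text part plus one binary attachment.
SAMPLE = (
    b"From: sender@example.com\r\n"
    b"Subject: Test\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="XYZ"\r\n'
    b"\r\n"
    b"--XYZ\r\n"
    b'Content-Type: text/plain; charset="utf-8"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"=E2=80=9CWhere there=E2=80=99s NiFi there is Happiness=E2=80=9D.\r\n"
    b"--XYZ\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b'Content-Disposition: attachment; filename="a.bin"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"AAEC\r\n"
    b"--XYZ--\r\n"
)

if __name__ == "__main__":
    print(extract_body(SAMPLE).strip())
```

A processor built along these lines would presumably need to let the user choose between the plain-text and HTML parts, which is what `preferencelist` controls in this sketch.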
