nifi-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-4326) ExtractEmailHeaders.java unhandled Exceptions
Date Fri, 01 Sep 2017 12:47:00 GMT

    [ https://issues.apache.org/jira/browse/NIFI-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150452#comment-16150452 ]

ASF GitHub Bot commented on NIFI-4326:
--------------------------------------

Github user btwood commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2111#discussion_r136566431
  
    --- Diff: nifi-nar-bundles/nifi-email-bundle/nifi-email-processors/src/main/java/org/apache/nifi/processors/email/ExtractEmailHeaders.java ---
    @@ -168,21 +173,40 @@ public void process(final InputStream rawIn) throws IOException {
                                 }
                             }
                         }
    -                    if (Array.getLength(originalMessage.getAllRecipients()) > 0) {
    -                        for (int toCount = 0; toCount < ArrayUtils.getLength(originalMessage.getRecipients(Message.RecipientType.TO)); toCount++) {
    -                            attributes.put(EMAIL_HEADER_TO + "." + toCount, originalMessage.getRecipients(Message.RecipientType.TO)[toCount].toString());
    +
    +                    // Get Non-Strict Recipient Addresses
    +                    InternetAddress[] recipients;
    +                    if (originalMessage.getHeader(Message.RecipientType.TO.toString(), ",") != null) {
    +                        recipients = InternetAddress.parseHeader(originalMessage.getHeader(Message.RecipientType.TO.toString(), ","), false);
    +                        for (int toCount = 0; toCount < ArrayUtils.getLength(recipients); toCount++) {
    +                            attributes.put(EMAIL_HEADER_TO + "." + toCount, recipients[toCount].toString());
                             }
    -                        for (int toCount = 0; toCount < ArrayUtils.getLength(originalMessage.getRecipients(Message.RecipientType.BCC)); toCount++) {
    -                            attributes.put(EMAIL_HEADER_BCC + "." + toCount, originalMessage.getRecipients(Message.RecipientType.BCC)[toCount].toString());
    +                    }
    +                    if (originalMessage.getHeader(Message.RecipientType.BCC.toString(), ",") != null) {
    +                        recipients = InternetAddress.parseHeader(originalMessage.getHeader(Message.RecipientType.BCC.toString(), ","), false);
    +                        for (int toCount = 0; toCount < ArrayUtils.getLength(recipients); toCount++) {
    +                            attributes.put(EMAIL_HEADER_BCC + "." + toCount, recipients[toCount].toString());
                             }
    -                        for (int toCount = 0; toCount < ArrayUtils.getLength(originalMessage.getRecipients(Message.RecipientType.CC)); toCount++) {
    -                            attributes.put(EMAIL_HEADER_CC + "." + toCount, originalMessage.getRecipients(Message.RecipientType.CC)[toCount].toString());
    +                    }
    +                    if (originalMessage.getHeader(Message.RecipientType.CC.toString(), ",") != null) {
    +                        recipients = InternetAddress.parseHeader(originalMessage.getHeader(Message.RecipientType.CC.toString(), ","), false);
    +                        for (int toCount = 0; toCount < ArrayUtils.getLength(recipients); toCount++) {
    +                            attributes.put(EMAIL_HEADER_CC + "." + toCount, recipients[toCount].toString());
                             }
                         }
    -                    // Incredibly enough RFC-2822 specified From as a "mailbox-list" so an array I returned by getFrom
    -                    for (int toCount = 0; toCount < ArrayUtils.getLength(originalMessage.getFrom()); toCount++) {
    -                        attributes.put(EMAIL_HEADER_FROM + "." + toCount, originalMessage.getFrom()[toCount].toString());
    +
    +                    // Get Non-Strict Sender Addresses
    +                    InternetAddress[] sender = null;
    +                    if (originalMessage.getHeader("From",",") != null) {
    +                        sender = (InternetAddress[])ArrayUtils.addAll(sender, InternetAddress.parseHeader(originalMessage.getHeader("From", ","), false));
    +                    }
    +                    if (originalMessage.getHeader("Sender",",") != null) {
    +                        sender = (InternetAddress[])ArrayUtils.addAll(sender, InternetAddress.parseHeader(originalMessage.getHeader("Sender", ","), false));
    --- End diff --
    
    My logic here was that I wanted ALL of the From/Sender addresses. Whether or not From
    is a mailbox-list, and whether or not a Sender header is present, this collects them all.
    Note that I'm merging them: if both headers are present, the addresses from both are added.
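
The merge described above can be sketched without JavaMail. This is only an illustration: plain strings stand in for the parsed `InternetAddress` objects, a `null` array stands in for an absent header, and `SenderMergeSketch` and `mergeAddresses` are hypothetical names that do not appear in the pull request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SenderMergeSketch {

    // Merges the addresses from two headers, keeping From before Sender.
    // A null array means the header was absent, so it contributes nothing.
    static List<String> mergeAddresses(String[] from, String[] sender) {
        List<String> merged = new ArrayList<>();
        if (from != null) {
            merged.addAll(Arrays.asList(from));
        }
        if (sender != null) {
            merged.addAll(Arrays.asList(sender));
        }
        return merged;
    }

    public static void main(String[] args) {
        String[] from = {"alice@example.com", "bob@example.com"};
        String[] sender = {"secretary@example.com"};
        System.out.println(mergeAddresses(from, sender));  // both headers present
        System.out.println(mergeAddresses(null, sender));  // only Sender present
        System.out.println(mergeAddresses(null, null));    // neither present
    }
}
```

The sketch does not remove duplicates, so an address that appears in both headers would be listed twice.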
    
    Again, I'll have to re-read the RFC to see if this is correct. Based on the implementation
    of getFrom() I found on grepcode, though, I figured it was.
    
    That may be a bad assumption, though: having read [RFC 822](https://www.ietf.org/rfc/rfc822.txt),
    I know a lot of implementations get email addresses wrong. I've seen plenty of accepted mail
    with addresses like " "@example.com. I think the breakdown is in the SHOULD/MUST contract,
    wherein a mail server SHOULD accept that address.
    
    Let me read up on the additional RFCs and get back to you. I can also dig through
    my mail archive to see what postfix has accepted/interpreted in the past. I've seen a lot
    of email address regexes that break because they don't treat "this is a valid address"@example.com
    as valid, even though postfix accepted it.
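
To make the regex point concrete, here is a small sketch of the failure mode. The pattern is a hypothetical "typical" one written for this illustration and not taken from any particular library; `NaiveAddressRegexSketch` and `naiveMatches` are likewise made-up names.

```java
import java.util.regex.Pattern;

public class NaiveAddressRegexSketch {

    // Hypothetical naive pattern: an unquoted local part, "@", and a dotted domain.
    // It has no notion of a quoted local part.
    static final Pattern NAIVE =
            Pattern.compile("^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$");

    static boolean naiveMatches(String address) {
        return NAIVE.matcher(address).matches();
    }

    public static void main(String[] args) {
        // An ordinary address passes the naive check.
        System.out.println(naiveMatches("user@example.com"));
        // A quoted local part is rejected, because '"' and ' ' are outside the character class.
        System.out.println(naiveMatches("\"this is a valid address\"@example.com"));
    }
}
```

A validator built this way would discard mail that a server had already accepted, which is the mismatch described above.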


> ExtractEmailHeaders.java unhandled Exceptions
> ---------------------------------------------
>
>                 Key: NIFI-4326
>                 URL: https://issues.apache.org/jira/browse/NIFI-4326
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.3.0
>         Environment: jdk 1.8.0_121-b13
>            Reporter: Benjamin Wood
>            Priority: Minor
>             Fix For: 1.4.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> The ExtractEmailHeaders processor throws a NullPointerException if there are no TO, CC,
> or BCC recipients.
> If there are no recipients, "originalMessage.getAllRecipients()" returns null, not
> a zero-length array.
> If an address is empty (<> or " "), then getRecipients() throws an "Empty Address"
> AddressException.
> It's possible this is only an issue with Oracle Java.
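
The null-versus-empty distinction in the report can be reproduced with the JDK alone. The sketch below assumes the `Array` in the removed line is `java.lang.reflect.Array` (the visible diff does not show the import), and uses a `null` `String[]` to stand in for the `getAllRecipients()` result; `NullSafeLengthSketch` and `safeLength` are hypothetical names.

```java
import java.lang.reflect.Array;

public class NullSafeLengthSketch {

    // Returns 0 for null instead of passing null on to Array.getLength.
    static int safeLength(Object array) {
        return array == null ? 0 : Array.getLength(array);
    }

    public static void main(String[] args) {
        String[] noRecipients = null; // stands in for a null getAllRecipients() result

        // The guarded helper treats "no recipients" like an empty array.
        System.out.println(safeLength(noRecipients));
        System.out.println(safeLength(new String[] {"a@example.com"}));

        // The unguarded call is expected to fail on null input; the exact
        // exception is printed rather than assumed.
        try {
            Array.getLength(noRecipients);
        } catch (RuntimeException e) {
            System.out.println("unguarded length check failed: " + e);
        }
    }
}
```

The patch under review sidesteps the problem differently, by checking each raw header for null before parsing it.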



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
