manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1410) Binary Attachment Data as Plain Text at Email Content
Date Sat, 15 Apr 2017 16:51:42 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970023#comment-15970023 ]

Karl Wright commented on CONNECTORS-1410:
-----------------------------------------

[~kamaci]:  This code does not bound its memory use: the entire message body has to be read
into memory here before it can be decoded.  That's not allowed in ManifoldCF.

{code}
-              InputStream is = msg.getInputStream();
+              InputStream is = new ByteArrayInputStream(extractBodyContent(msg).getBytes(StandardCharsets.UTF_8));
{code}
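
To illustrate what "bounded in memory use" means here, the following is a minimal sketch, not the fix for this issue. It copies a stream through a fixed-size buffer, so memory use stays constant regardless of message size, in contrast to building the whole body as a String and wrapping it in a ByteArrayInputStream. The class name, method name, and buffer size are hypothetical, and plain java.io streams stand in for the message stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of ManifoldCF.
class BoundedCopySketch {
    // Assumed buffer size; any fixed value keeps memory use bounded.
    static final int BUFFER_SIZE = 8192;

    // Copies everything from in to out while holding at most BUFFER_SIZE bytes
    // of content in memory at a time. Returns the number of bytes copied.
    static long copyBounded(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // A small in-memory stream stands in for a message body stream.
        byte[] body = "Hello, ManifoldCF".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyBounded(new ByteArrayInputStream(body), sink);
        System.out.println(copied + " bytes copied");
    }
}
```

In a real connector the sink would be whatever consumes the document content; the point is only that no step requires the full body to exist as a single String or byte array.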

[~cguzel] The door is closed for non-critical fixes for 2.7.  This fix has problems (described
above) and does not seem critical to me.  I am not going to hold the release for new features
at this point.


> Binary Attachment Data as Plain Text at Email Content
> -----------------------------------------------------
>
>                 Key: CONNECTORS-1410
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1410
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Email connector
>    Affects Versions: ManifoldCF 2.6
>            Reporter: Furkan KAMACI
>            Assignee: Furkan KAMACI
>             Fix For: ManifoldCF 2.8
>
>         Attachments: CONNECTORS-1410.patch
>
>
> Previously, we indexed e-mails and their attachments together. CONNECTORS-1375 changed this logic so that an e-mail and its attachments are indexed separately.
> However, there is a problem: the content field of an e-mail that has attachments includes both the body and the attachments' binary content as plain text.
> Since attachments are now indexed separately, we can index just the body as content instead of appending the e-mail body and all attachments' binary data as plain text.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
