manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-1410) Binary Attachment Data as Plain Text at Email Content
Date Sat, 15 Apr 2017 16:51:42 GMT


Karl Wright commented on CONNECTORS-1410:

[~kamaci]:  This code is not bounded in memory use; the entire message body must be read into
memory here before it can be decoded.  ManifoldCF does not allow that.

-              InputStream is = msg.getInputStream();
+              InputStream is = new ByteArrayInputStream(extractBodyContent(msg).getBytes(StandardCharsets.UTF_8));
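
For illustration only, here is a minimal sketch of what "bounded in memory use" can look like in
plain Java: the content is moved through a fixed-size buffer, so memory use depends on the
buffer size and not on the size of the message. The class BoundedCopy and the method copyBounded
are hypothetical names that appear neither in the patch nor in ManifoldCF, and the
ByteArrayOutputStream in main is only a stand-in for a real downstream consumer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class BoundedCopy {
    // Hypothetical helper (not a ManifoldCF or JavaMail API): copies a stream
    // chunk by chunk, so at most bufferSize bytes are held by this method at once.
    public static long copyBounded(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        // read() returns -1 at end of stream; otherwise the number of bytes read.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a message body stream; a real connector would get it from the message.
        byte[] body = "example message body".getBytes(StandardCharsets.UTF_8);
        InputStream in = new ByteArrayInputStream(body);
        // Stand-in sink for the demo only; it buffers everything, which a real consumer should not.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyBounded(in, sink, 8);
        System.out.println(copied + " bytes copied");
    }
}
```

By contrast, the "+" line in the diff above builds a complete String and then a complete byte
array before any stream is handed on, so its memory use grows with the size of the body.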

[~cguzel] The door is closed for non-critical fixes for 2.7.  This fix has problems (described
above) and does not seem critical to me.  I am not going to hold the release for new features
at this point.

> Binary Attachment Data as Plain Text at Email Content
> -----------------------------------------------------
>                 Key: CONNECTORS-1410
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Email connector
>    Affects Versions: ManifoldCF 2.6
>            Reporter: Furkan KAMACI
>            Assignee: Furkan KAMACI
>             Fix For: ManifoldCF 2.8
>         Attachments: CONNECTORS-1410.patch
> Previously, we indexed e-mails and their attachments together. CONNECTORS-1375 changed this
> logic so that an e-mail and its attachments are indexed separately.
> However, there is a problem: the content field of an e-mail that has attachments includes
> both the body and the attachments' binary content as plain text.
> Since we index attachments separately, we can index just the body as content instead of
> appending the attachments' binary data to the e-mail body as plain text.

This message was sent by Atlassian JIRA