manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1396) Email processing multipart casting problem
Date Tue, 28 Mar 2017 20:21:41 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945884#comment-15945884 ]

Karl Wright commented on CONNECTORS-1396:
-----------------------------------------

Hi [~cguzel], this is coming from the Solr output connector, but the source of the problem
is a bad mime type coming from an email attachment:

{code}
              final String mimeType = part.getContentType();
              if (!activities.checkMimeTypeIndexable(mimeType)) {
                errorCode = activities.EXCLUDED_MIMETYPE;
                errorDesc = "Excluded because of mime type ('"+mimeType+"')";
                activities.noDocument(documentIdentifier, version);
                continue;
              }

              RepositoryDocument rd = new RepositoryDocument();
              rd.setFileName(part.getFileName());
              rd.setMimeType(mimeType);
              ...
{code}

I would love to find out exactly what the mime type is that's coming from Exchange that upsets
HttpClient so badly.  Can you add a log statement to the above code so we can see the mime
types that are coming from each attachment?  Thanks!
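
For illustration only, here is a rough, self-contained sketch of the kind of log line I have in mind. The class name, the helper name describeMimeType, the logger name, and the sample values are all made up for this example; inside the connector you would presumably log through the connector's own logger, right after the call to part.getContentType().

```java
import java.util.logging.Logger;

public class MimeTypeLogging {
  // Hypothetical logger name, used only in this sketch.
  private static final Logger LOG = Logger.getLogger("email.connector.sketch");

  // Builds the log message. A null content type is reported explicitly,
  // and non-null values are quoted so stray whitespace or parameters stay visible.
  static String describeMimeType(String documentIdentifier, int partIndex, String mimeType) {
    String shown = (mimeType == null) ? "<null>" : "'" + mimeType + "'";
    return "Email: Document '" + documentIdentifier + "' attachment " + partIndex
        + " has mime type " + shown;
  }

  public static void main(String[] args) {
    // Made-up sample values, not taken from the reported mailbox.
    String[] sampleTypes = { "application/pdf", "text/plain; charset=\"us-ascii\"", null };
    for (int i = 0; i < sampleTypes.length; i++) {
      LOG.info(describeMimeType("INBOX:<example-id>", i, sampleTypes[i]));
    }
  }
}
```

Quoting the value matters here because the string is logged exactly as received, which should make an unusual or malformed type easy to spot in the output.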


> Email processing multipart casting problem
> ------------------------------------------
>
>                 Key: CONNECTORS-1396
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1396
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Email connector
>    Affects Versions: ManifoldCF 2.6
>            Reporter: Cihad Guzel
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.7
>
>         Attachments: CONNECTORS-1396.patch
>
>
> I am trying the email connector with Exchange Server 2013 and I get some errors.
> If I select "Encoding of Attachment" on the metadata tab of the email connector:
> {code}
> DEBUG 2017-03-08 19:31:13,646 (Worker thread '1') - Email: Processing document identifier 'INBOX:<6bc4ff202f884bc396aab70cdfd01eb7@EXCHANGE.mydomain.com>'
> FATAL 2017-03-08 19:31:18,243 (Worker thread '1') - Error tossed: java.lang.String cannot be cast to javax.mail.Multipart
> java.lang.ClassCastException: java.lang.String cannot be cast to javax.mail.Multipart
> 	at org.apache.manifoldcf.crawler.connectors.email.EmailConnector.processDocuments(EmailConnector.java:631)
> 	at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
> {code}
> If I select "MIME type of attachment" on the metadata tab of the email connector:
> {code}
> DEBUG 2017-03-08 19:37:40,026 (Worker thread '40') - Email: Processing document identifier 'INBOX:<173298b91b9f4d439aeb14169b306553@EXCHANGE..mydomain.com>'
> FATAL 2017-03-08 19:37:40,633 (Worker thread '30') - Error tossed: java.lang.String cannot be cast to javax.mail.Multipart
> java.lang.ClassCastException: java.lang.String cannot be cast to javax.mail.Multipart
> 	at org.apache.manifoldcf.crawler.connectors.email.EmailConnector.processDocuments(EmailConnector.java:651)
> 	at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
> {code}
> I saw a similar issue: https://issues.apache.org/jira/browse/CONNECTORS-1260



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
