manifoldcf-dev mailing list archives

From "Zoltan Farago (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CONNECTORS-1591) RTF comment parsing problem
Date Tue, 12 Mar 2019 06:59:00 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790287#comment-16790287 ]

Zoltan Farago edited comment on CONNECTORS-1591 at 3/12/19 6:58 AM:
--------------------------------------------------------------------

[~kwright@metacarta.com] the output is an Elastic index. Comments in all other file types (.doc,
.xls, .pdf, .dcx, .odt, etc.) are separated from the content text by a space.

In RTF files the space is missing.


was (Author: zfarago):
the output is an Elastic index. Comments in all other file types (.doc, .xls, .pdf, .dcx, .odt,
etc.) are separated from the content text by a space.

> RTF comment parsing problem
> ---------------------------
>
>                 Key: CONNECTORS-1591
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
>             Project: ManifoldCF
>          Issue Type: Bug
>            Reporter: Zoltan Farago
>            Priority: Major
>         Attachments: comment.rtf, result.txt
>
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF file, the
> result has no separator between the comment and the content text. See the attachments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
