manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
Date Wed, 26 Aug 2015 07:23:45 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712642#comment-14712642 ]

Karl Wright commented on CONNECTORS-1234:
-----------------------------------------

Hi Abe-san,

I see no reason to include a binary content length check in this connector, since the Document
Filter transformation connector has exactly the same functionality.  We need to be careful
not to duplicate functionality unnecessarily, or we will have a very messy situation.  Or
am I missing something?

> TikaExtractor based indexing on Elasticsearch connector
> -------------------------------------------------------
>
>                 Key: CONNECTORS-1234
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1234
>             Project: ManifoldCF
>          Issue Type: Improvement
>            Reporter: Shinichiro Abe
>            Assignee: Shinichiro Abe
>         Attachments: CONNECTORS-1234.patch
>
>
> We could add a use-mapper-attachments flag.
> It defaults to true, which is the current behavior and requires the mapper-attachments plugin on the ES side.
> If false, it lets us index the content and metadata extracted from files through the
> Tika transformer, which means there is no need to install that plugin or to send base64-encoded
> content.
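
To make the two modes of the proposed flag concrete, here is a minimal sketch of how the indexed document body might differ. It is not taken from the attached patch: the class name, the method names, and the JSON field names "file" and "content" are all assumptions chosen for illustration, and the real connector may structure its request differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class IndexBodySketch {

    // Hypothetical helper: minimal JSON string escaping, enough for this sketch.
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds a JSON body for one document.
    // useMapperAttachments == true:  send the raw bytes base64-encoded and let
    //                                a server-side plugin extract the text.
    // useMapperAttachments == false: send text that was already extracted
    //                                upstream (for example by a Tika transformer).
    // The field names "file" and "content" are assumptions, not the connector's.
    public static String buildBody(boolean useMapperAttachments,
                                   byte[] rawBytes,
                                   String extractedText) {
        if (useMapperAttachments) {
            String encoded = Base64.getEncoder().encodeToString(rawBytes);
            return "{\"file\":\"" + encoded + "\"}";
        }
        return "{\"content\":\"" + jsonEscape(extractedText) + "\"}";
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(buildBody(true, raw, null));     // base64 payload
        System.out.println(buildBody(false, raw, "hello")); // plain extracted text
    }
}
```

In the second mode the body carries plain text only, so nothing on the Elasticsearch side has to decode or parse the binary file.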



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
