manifoldcf-dev mailing list archives

From "Mingchun Zhao (JIRA)"
Subject [jira] [Created] (CONNECTORS-1079) the parsing in TikaExtractor always return empty result
Date Thu, 23 Oct 2014 08:18:34 GMT
Mingchun Zhao created CONNECTORS-1079:

             Summary: the parsing in TikaExtractor always return empty result
                 Key: CONNECTORS-1079
             Project: ManifoldCF
          Issue Type: Bug
          Components: Tika extractor
    Affects Versions: ManifoldCF 2.0
            Reporter: Mingchun Zhao

When I try the Tika content extractor with the latest trunk source (2.0), it does not
return the expected results.
Stepping through with a debugger, I found that the parser used by the Tika content
extractor does not return any data.
I then tried moving lib/tika-core-1.6.jar into connector-lib/.
After that, the Tika content extractor returned data as expected.

My configuration is as follows:
 Type: Tika content extractor
 Type: Solr (Use extract update handler=false)
 Type: Web
 1. Type: repository
 2. Type: transformation
 3. Type: output

This may be related to CONNECTORS-1074 (?).
It looks like the location of tika-core-1.6.jar affects the result of TikaExtractor.
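
One way to check the jar-location hypothesis is to print which class loader and which jar a given class was actually loaded from. The following is only a minimal sketch, not ManifoldCF code: the class name WhereLoaded and its describe helper are hypothetical, and it uses only standard JDK calls. It would have to run inside the same process and class loader context as the connector to tell anything useful.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of ManifoldCF or Tika.
class WhereLoaded {

    // Builds a one-line description of the loader and code location of a class.
    static String describe(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader();
        CodeSource source = cls.getProtectionDomain().getCodeSource();

        // A null loader means the class came from the bootstrap class loader.
        String loaderName = (loader == null) ? "bootstrap" : loader.getClass().getName();

        // The code source can be null, for example for JDK core classes.
        String location = (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();

        return cls.getName() + " | loader=" + loaderName + " | location=" + location;
    }

    public static void main(String[] args) {
        // Pass a fully qualified class name; defaults to a JDK class so the sketch runs anywhere.
        String name = (args.length > 0) ? args[0] : "java.lang.String";
        try {
            System.out.println(describe(Class.forName(name)));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not visible from this class loader");
        }
    }
}
```

Calling describe on a Tika class (for example a parser class) before and after moving the jar would show whether it resolves from lib/ or from connector-lib/ in each case. Which Tika class to inspect is left as an assumption here.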

