manifoldcf-dev mailing list archives

From "Mingchun Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CONNECTORS-1079) the parsing in TikaExtractor always return empty result
Date Thu, 23 Oct 2014 08:18:34 GMT
Mingchun Zhao created CONNECTORS-1079:
-----------------------------------------

             Summary: the parsing in TikaExtractor always return empty result
                 Key: CONNECTORS-1079
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1079
             Project: ManifoldCF
          Issue Type: Bug
          Components: Tika extractor
    Affects Versions: ManifoldCF 2.0
            Reporter: Mingchun Zhao


When I try the Tika content extractor with the latest trunk source (2.0), it does not return
the expected results.
Stepping through it with a debugger, I found that the parser used by the Tika content extractor
does not return any data.
As an experiment, I moved lib/tika-core-1.6.jar into connector-lib/.
After that, the Tika content extractor returned data as expected.

My configuration is as follows:
==
Transformation:
 Type: Tika content extractor
Output:
 Type: Solr (Use extract update handler = false)
Repository:
 Type: Web
Job:
 1. Type: repository
 2. Type: transformation
 3. Type: output
==

This may be related to CONNECTORS-1074(?).
It looks like the location of tika-core-1.6.jar affects the result of TikaExtractor.
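
One way to check the jar-location theory is to print which jar and which class loader a class actually comes from. The following is only a rough sketch, not ManifoldCF code: ClassOrigin and describe() are hypothetical names made up for illustration. The two Tika class names in main() are used on the assumption that AutoDetectParser ships in tika-core and PDFParser in tika-parsers.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

// Hypothetical diagnostic helper (not part of ManifoldCF or Tika).
public class ClassOrigin {

    // Reports the class loader and jar location for a class name,
    // or "not found" if the class is not visible from this class's loader.
    public static String describe(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            ClassLoader loader = c.getClassLoader();
            ProtectionDomain pd = c.getProtectionDomain();
            CodeSource src = (pd == null) ? null : pd.getCodeSource();
            String where = (src == null || src.getLocation() == null)
                    ? "unknown"
                    : src.getLocation().toString();
            // A null loader means the class came from the bootstrap class loader
            return className
                    + " | loader=" + (loader == null ? "bootstrap" : loader.toString())
                    + " | source=" + where;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " | not found (" + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        // Assumed locations: AutoDetectParser in tika-core, PDFParser in tika-parsers
        System.out.println(describe("org.apache.tika.parser.AutoDetectParser"));
        System.out.println(describe("org.apache.tika.parser.pdf.PDFParser"));
    }
}
```

Run standalone, this only reflects its own classpath. To say anything about the extractor, the same check would have to run inside the ManifoldCF process, from code loaded the same way as the connector. If the two classes then report different loaders, that would be consistent with the jar placement mattering, but it would not prove the cause.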
 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
