manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1563) SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
Date Wed, 20 Feb 2019 11:18:00 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772912#comment-16772912 ]

Karl Wright commented on CONNECTORS-1563:
-----------------------------------------

[~Subasini], the "error" is because it does not recognize a specific translation bundle for
your language, so it defaults to English.  It is harmless.

I asked you to *try* working with a File System connection initially to narrow down where
your problems were coming from.  Please do so.  At the end of last year, when we were debugging
the 2.11 release of ManifoldCF, [~shinichiro abe] and I both tried a configuration similar to
the one you report.

> SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
> -----------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1563
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1563
>             Project: ManifoldCF
>          Issue Type: Task
>          Components: Lucene/SOLR connector
>            Reporter: Sneha
>            Assignee: Karl Wright
>            Priority: Major
>         Attachments: Document simple history.docx, Manifold and Solr settings_CustomField.docx, managed-schema, manifold settings.docx, manifoldcf.log, path.png, schema.png, solr.log, solrconfig.xml
>
>
> I am encountering this problem:
> When I check the "Use the Extract Update Handler:" parameter, I get an error on Solr: null:org.apache.solr.common.SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
> If I ignore the Tika exception, my documents get indexed but don't have a content field in Solr.
> I am using Solr 7.3.1 and ManifoldCF 2.8.1.
> I am using Solr Cell and hence have not configured an external Tika extractor in the ManifoldCF pipeline.
> Please help me with this problem.
> Thanks in advance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
