manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1533) Solr Connector is unable to ingest documents
Date Sat, 22 Sep 2018 17:44:00 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624761#comment-16624761 ]

Karl Wright commented on CONNECTORS-1533:
-----------------------------------------

[~shinichiro abe] I get this particular error from Solr when I try to index a zero-length
file:

{code}
 WARN 2018-09-22T13:38:09,581 (Worker thread '32') - Solr exception during indexing file:/C:/wip/mcf-release-scripts/release-scripts/.svn/wc.db-journal (500): Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643) ~[?:?]
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255) ~[?:?]
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244) ~[?:?]
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483) ~[?:?]
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1106) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:886) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:819) ~[?:?]
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194) ~[?:?]
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211) ~[?:?]
	at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:967) ~[?:?]
 WARN 2018-09-22T13:38:09,595 (Worker thread '32') - Service interruption reported for job 1537637859471 connection 'files': Solr exception during indexing file:/C:/wip/mcf-release-scripts/release-scripts/.svn/wc.db-journal (500): Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
 WARN 2018-09-22T13:39:09,959 (Worker thread '46') - Solr exception during indexing file:/C:/wip/mcf-release-scripts/release-scripts/.svn/wc.db-journal (500): Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643) ~[?:?]
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255) ~[?:?]
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244) ~[?:?]
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483) ~[?:?]
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1106) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:886) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:819) ~[?:?]
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194) ~[?:?]
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211) ~[?:?]
	at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:967) ~[?:?]
 WARN 2018-09-22T13:39:09,968 (Worker thread '46') - Service interruption reported for job 1537637859471 connection 'files': Solr exception during indexing file:/C:/wip/mcf-release-scripts/release-scripts/.svn/wc.db-journal (500): Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
 WARN 2018-09-22T13:40:10,352 (Worker thread '40') - Solr exception during indexing file:/C:/wip/mcf-release-scripts/release-scripts/.svn/wc.db-journal (500): Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
	at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643) ~[?:?]
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255) ~[?:?]
	at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244) ~[?:?]
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483) ~[?:?]
	at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1106) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:886) ~[?:?]
	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:819) ~[?:?]
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194) ~[?:?]
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211) ~[?:?]
	at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:967) ~[?:?]
 WARN 2018-09-22T13:40:10,363 (Worker thread '40') - Service interruption reported for job 1537637859471 connection 'files': Solr exception during indexing file:/C:/wip/mcf-release-scripts/release-scripts/.svn/wc.db-journal (500): Error from server at http://192.168.1.143:8983/solr/collection1: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
{code}

Standalone Solr doesn't reject zero-length documents.  Is there a way to turn off that rejection
in Solr Cloud?
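
Independent of any Solr-side setting, one possible client-side workaround would be to skip empty files before they are posted. The following is only a minimal sketch using the JDK alone; `ZeroLengthGuard` and `hasContent` are hypothetical names and are not part of ManifoldCF or SolrJ, and this is not how the Solr connector currently behaves.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (an assumption for illustration, not ManifoldCF code). */
final class ZeroLengthGuard {
    private ZeroLengthGuard() {}

    /** Returns true only for regular files that contain at least one byte. */
    static boolean hasContent(Path file) throws IOException {
        return Files.isRegularFile(file) && Files.size(file) > 0L;
    }

    public static void main(String[] args) throws IOException {
        // Create one empty and one non-empty temporary file to exercise the check.
        Path empty = Files.createTempFile("empty", ".tmp");
        Path nonEmpty = Files.createTempFile("nonempty", ".tmp");
        Files.write(nonEmpty, new byte[] {1});
        try {
            System.out.println("empty file has content: " + hasContent(empty));
            System.out.println("non-empty file has content: " + hasContent(nonEmpty));
        } finally {
            Files.deleteIfExists(empty);
            Files.deleteIfExists(nonEmpty);
        }
    }
}
```

A caller would check `hasContent(path)` and skip the document when it returns false, so a zero-byte stream never reaches the extraction step that raises ZeroByteFileException. Whether skipping is acceptable depends on whether empty files should still be indexed by their metadata.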


> Solr Connector is unable to ingest documents
> --------------------------------------------
>
>                 Key: CONNECTORS-1533
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1533
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Lucene/SOLR connector
>    Affects Versions: ManifoldCF 2.11
>            Reporter: Julien Massiera
>            Assignee: Karl Wright
>            Priority: Major
>             Fix For: ManifoldCF 2.11
>
>         Attachments: 2018-09-23-012800.png, CONNECTORS-1533.patch
>
>
> The change "r69acbd9 - Fix solr connector content deletion bug" has introduced another bug:
> It is now impossible to ingest documents into Solr 7.4.0; we obtain the following error: Error from server at http://localhost:8983/solr/FileShare: missing content stream
> In fact, the object returned by requestWriter.getContentWriter(request) is null only on commit requests. The new lines of code introduced by the fix, which are based on a test of this object, therefore result in a null Collection<ContentStream> streams object, and so the update request fails.
> Affected class: org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrClient



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
