manifoldcf-dev mailing list archives

From "Maxence SAUNIER (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
Date Wed, 25 Apr 2018 10:11:00 GMT


Maxence SAUNIER commented on CONNECTORS-1503:

Hello Karl,

I ran some tests today and the problem persists. I don't know whether the problem lies in
the ManifoldCF config or the Solr config.

If I check the Request Handler checkbox in ManifoldCF, I get org.apache.tika.exception.ZeroByteFileException:
InputStream must have > 0 bytes. I do have files with no content, but if I uncheck the Request
Handler checkbox in ManifoldCF, Tika ignores this problem and simply does not send the content
field. The content field is not required in my Solr schema.

So I modified the /update/extract requestHandler to deactivate parameters, but that did not
solve my problem.

And if I uncheck the Request Handler checkbox in ManifoldCF, my content is just the file content.
Without the checkbox, content = "{file_length} {mime_type} {other} {content_text}"

Tika runs on the ManifoldCF side and this is a Tika exception. Is it possible that the Tika
parameters are different when I check the Request Handler checkbox? Or do I have a default
update processor in Solr that has a problem?


> UpdateProcessor SolrCloud and ManifoldCF
> ----------------------------------------
>                 Key: CONNECTORS-1503
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Solr 6.x component
>    Affects Versions: ManifoldCF 2.9.1
>         Environment: SolrCloud 6.6
> ManifoldCF 2.9.1
>            Reporter: Maxence SAUNIER
>            Assignee: Shinichiro Abe
>            Priority: Major
>         Attachments: 20170421-1740.png, jira_update_processor.png, manifoldcf_arguments_uniqFields.png,
> Hello,
> [Link to Apache mail archive|]
> When we use the Argument option in ManifoldCF for SolrCloud, ManifoldCF adds the arguments
> to the POST request and not to the URL parameters. So it is not possible to add a
> (pre)processor or a post-processor through the URL.
> [SolrConfig updateRequestProcessorChain|]
> [call UpdateRequestProcessors|]
> [Conf image|]
> Solr response:
> org.apache.solr.common.SolrException: ERROR: [doc=file://///srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc]
> unknown field 'processor'
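
To illustrate the distinction the issue description draws, here is a minimal Python sketch of the
two places a `processor` argument can travel: in the URL query string, or in the POST request.
The host, collection name and processor name are hypothetical placeholders, and the form-encoded
body is only an assumption about how such an argument might be carried; the sketch does not
reproduce what ManifoldCF actually sends or how Solr interprets it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical values for illustration only; none of them come from the issue.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update/extract"
PROCESSOR_NAME = "myProcessor"


def url_with_processor(base_url: str, processor: str) -> str:
    """Return base_url with `processor` appended as a URL query parameter."""
    return base_url + "?" + urlencode({"processor": processor})


def body_with_processor(processor: str) -> bytes:
    """Return a form-encoded POST body that carries `processor` instead.

    Assumption: form encoding is used here only to make the contrast visible.
    """
    return urlencode({"processor": processor}).encode("utf-8")


# Case 1: the argument is part of the URL.
url = url_with_processor(SOLR_UPDATE_URL, PROCESSOR_NAME)

# Case 2: the URL stays bare and the argument sits in the request body.
body = body_with_processor(PROCESSOR_NAME)

print(url)
print(SOLR_UPDATE_URL, body)
```

In the first case the argument is visible in the query string of the request URL; in the second
the URL carries no parameters at all, which is the situation the report describes.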

This message was sent by Atlassian JIRA
