manifoldcf-dev mailing list archives

From "Ahmet Arslan (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-491) Solr Connector does not treat metadata fields with semicolons in them properly
Date Fri, 13 Jul 2012 10:32:33 GMT


Ahmet Arslan commented on CONNECTORS-491:

Hi Karl, here is the stack trace Salih sent me. It seems that the problem is not the ";" character.
The multiple values for author are an empty string and 36;#Emek Kilic.

I can see that MCF sets &literal.Author=36;#Emek+Kilic. Could it be that Tika extracts
an empty string as Author metadata?

bq. fields are generated by Tika or passed in as literals via literal.fieldname=value. Before
Solr4.0 or if literalsOverride=false, then literals will be appended as multi-value to tika
generated field.
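
To illustrate the behavior quoted above, here is a minimal sketch in Python. It is only a simulation under stated assumptions, not Solr's actual code: the helper names merge_fields and check_single_valued are hypothetical, and it assumes that Tika produces an empty Author value and that both the Tika value and the literal end up in the same single-valued field "author".

```python
def merge_fields(tika_fields, literals, literals_override=False):
    """Combine Tika-extracted values with literal.* values, field by field.

    Hypothetical helper: mimics the quoted description, where literals are
    appended to Tika-generated values unless literals_override is set.
    """
    merged = {name: list(values) for name, values in tika_fields.items()}
    for name, values in literals.items():
        if literals_override:
            # The literal replaces whatever Tika extracted for this field.
            merged[name] = list(values)
        else:
            # The literal is appended, so the field becomes multi-valued.
            merged.setdefault(name, []).extend(values)
    return merged


def check_single_valued(doc, single_valued_fields):
    """Raise if a field assumed to be single-valued holds several values."""
    for name in single_valued_fields:
        values = doc.get(name, [])
        if len(values) > 1:
            raise ValueError(
                f"multiple values encountered for non multiValued field {name}: {values}"
            )


# Assumption: Tika extracted an empty string as the author.
tika_fields = {"author": [""]}
# The literal sent by the connector, already URL-decoded.
literals = {"author": ["36;#Emek Kilic"]}

appended = merge_fields(tika_fields, literals)
overridden = merge_fields(tika_fields, literals, literals_override=True)
```

In this sketch, check_single_valued(appended, ["author"]) raises because the field holds both the empty string and the literal, while the overridden document keeps only the literal and passes the check.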


INFO: [collection1] webapp=/solr path=/update/extract params=
} {} 0 120
Jul 12, 2012 2:16:40 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: ERROR: [doc=http://iknow:2525/Documents/Vekaletname.pdf]
multiple values encountered for non multiValued field author: [, 36;#Emek Kilic]
> Solr Connector does not treat metadata fields with semicolons in them properly
> ------------------------------------------------------------------------------
>                 Key: CONNECTORS-491
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Lucene/SOLR connector
>    Affects Versions: ManifoldCF 0.5.1
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 0.7
> The Solr connector does not escape metadata values that have ";" characters in them,
> and Solr thus tries to treat these as multi-valued fields.
