lucene-solr-user mailing list archives

From vatuska <vatu...@yandex.ru>
Subject Language detection for multivalued field
Date Tue, 22 Oct 2013 10:59:26 GMT
Is there any way to detect the language of a multivalued field?
It doesn't seem to work when a document contains several values in
different languages.

*I have a multivalued field in schema.xml:*
...
<field name="tag" type="text_general" indexed="true" stored="true" required="false" multiValued="true"/>
...
<dynamicField name="*_undfnd" type="text_general" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_en" type="text_en_splitting" indexed="true" stored="true" multiValued="true"/>

*And I have configured an updateRequestProcessorChain:*

<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">tag</str>
    <str name="langid.langField">lang_global</str>
    <str name="langid.langsField">langs</str>
    <bool name="langid.map">true</bool>
    <bool name="langid.map.individual">true</bool>
    <bool name="langid.map.keepOrig">true</bool>
    <str name="langid.fallback">undfnd</str>
    <str name="langid.whitelist">en,en_GB,en_US</str>
    <str name="langid.map.individual.fl">title,source,tag,creatorName,description</str>
    <str name="langid.map.lcmap">en_GB:en en_US:en</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

*Everything works fine for a document with a single value:*
...
<field name="tag">My test tag</field>
...

*And everything works fine for a document with several English values:*
...
<field name="tag">test</field>
<field name="tag">first</field>
<field name="tag">My tag</field>
...

*But for a document with values in different languages:*
...
<field name="tag">español</field>
<field name="tag">first</field>
<field name="tag">My tag</field>
...
*no tag value is indexed at all.*
*But I expect:*
tag_en : first, My tag
tag_undfnd : español
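
The expected result can be made concrete with a small Python sketch of per-value mapping. This is only an illustration of the behavior described above, not Solr code and not a description of how the langid processor works internally. The names `toy_detect` and `map_values` are hypothetical, and `toy_detect` is a deliberately crude stand-in for a real language detector (it treats pure-ASCII text as English). `WHITELIST` and `FALLBACK` mirror the `langid.whitelist` and `langid.fallback` settings from the chain.

```python
from collections import defaultdict

WHITELIST = {"en"}    # languages that get their own mapped field
FALLBACK = "undfnd"   # suffix used when the detected language is not whitelisted


def toy_detect(value):
    """Hypothetical detector: pure-ASCII text counts as English, anything else as Spanish."""
    return "en" if value.isascii() else "es"


def map_values(field, values, detect=toy_detect):
    """Route each value of a multivalued field to <field>_<lang> on its own."""
    mapped = defaultdict(list)
    for value in values:
        lang = detect(value)
        if lang not in WHITELIST:
            lang = FALLBACK
        mapped[f"{field}_{lang}"].append(value)
    return dict(mapped)


print(map_values("tag", ["español", "first", "My tag"]))
# prints {'tag_undfnd': ['español'], 'tag_en': ['first', 'My tag']}
```

The point of the sketch is that the language decision is made per value rather than once for the concatenated field content, so a single non-English value does not affect where the other values end up.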

Is there any way to fix this?



--
View this message in context: http://lucene.472066.n3.nabble.com/Language-detection-for-multivalued-field-tp4096996.html
Sent from the Solr - User mailing list archive at Nabble.com.
