lucene-dev mailing list archives

From "Stanislaw Osinski (Resolved) (JIRA)" <>
Subject [jira] [Resolved] (SOLR-2939) Clustering of multilingual search results
Date Sat, 17 Dec 2011 13:44:33 GMT


Stanislaw Osinski resolved SOLR-2939.

       Resolution: Fixed
    Fix Version/s: 4.0

In trunk and branch_3x. Wiki page updated. The language code variable expansion in field names
has not been implemented yet; I'll move it to a dedicated issue.
> Clustering of multilingual search results
> -----------------------------------------
>                 Key: SOLR-2939
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - Clustering
>            Reporter: Stanislaw Osinski
>            Assignee: Stanislaw Osinski
>             Fix For: 3.6, 4.0
> Carrot2 internally supports clustering of multilingual search results. The clustering
> component should allow passing a language field to Carrot2. This feature would need at least
> two new parameters: {{carrot.lang}} for the name of the Solr field that contains the language
> code (ISO 639), and {{carrot.lcmap}}, similar to the one in the language recognizer, to map
> arbitrary strings to ISO 639 codes.
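
To make the {{carrot.lcmap}} idea concrete, here is a minimal sketch of such a mapping. It assumes the parameter value is a whitespace-separated list of from:to pairs; that format, the class name LanguageCodeMap, and the method toIsoCode are all hypothetical and are not taken from Solr or Carrot2.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Solr or Carrot2.
final class LanguageCodeMap {
    private final Map<String, String> mappings = new HashMap<>();

    // Assumed format: whitespace-separated "from:to" pairs,
    // for example "english:en polski:pl".
    LanguageCodeMap(String spec) {
        if (spec == null || spec.trim().isEmpty()) {
            return; // no mappings configured
        }
        for (String pair : spec.trim().split("\\s+")) {
            int colon = pair.indexOf(':');
            if (colon <= 0 || colon == pair.length() - 1) {
                throw new IllegalArgumentException("Expected from:to, got: " + pair);
            }
            mappings.put(normalize(pair.substring(0, colon)), pair.substring(colon + 1));
        }
    }

    // Maps the raw value of the language field to a code. Values without a
    // mapping are returned trimmed and lower-cased, on the assumption that
    // they already are ISO 639 codes.
    String toIsoCode(String fieldValue) {
        if (fieldValue == null) {
            return null;
        }
        String key = normalize(fieldValue);
        return mappings.getOrDefault(key, key);
    }

    private static String normalize(String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }
}
```

With a value such as "english:en polski:pl", a document whose language field contains "English" would be passed to the clustering algorithm as "en", while a field that already holds "de" would pass through unchanged.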
> Another feature of the language recognizer we should mirror is the expansion of the {{{lang}}}
> token in field names into the language code of the document (in case of multiple languages
> per document -- the first Carrot2-supported language code). The feature seems easy to implement
> in the non-distributed setting of Solr, but the simple implementation isn't going to work
> in the distributed setting because the name of the specific field to be fetched depends on
> the content (language) of each matching document. Looking at the
> {{SearchClusteringEngine.getFieldsToLoad(SolrQueryRequest)}} method, a quick but costly
> solution would be to load the contents of all stored fields. I'm not too strong in
> distributed-mode Solr, but maybe this could be optimized so that only the required fields
> get fetched?
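
The token expansion described above can be sketched as follows. This is only an illustration under stated assumptions: the class LangFieldExpander and both of its methods are hypothetical, and the field template "title_{lang}" used in the note below is an invented example. The candidateFields method shows one conceivable way to bound the set of fields to load when the per-document language is not known in advance; the issue itself does not propose it.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Solr or Carrot2.
final class LangFieldExpander {
    static final String TOKEN = "{lang}";

    // Expands the token using the first language code of the document that
    // is in the supported set. Returns null when the template contains the
    // token but none of the document's languages is supported.
    static String expand(String template, List<String> docLanguages, Set<String> supported) {
        if (!template.contains(TOKEN)) {
            return template; // plain field name, nothing to expand
        }
        for (String code : docLanguages) {
            if (supported.contains(code)) {
                return template.replace(TOKEN, code);
            }
        }
        return null;
    }

    // Assumption: if the set of supported codes is known up front, expanding
    // the template for each of them gives a bounded list of field names,
    // which is smaller than "all stored fields".
    static Set<String> candidateFields(String template, Set<String> supported) {
        Set<String> fields = new LinkedHashSet<>();
        for (String code : supported) {
            fields.add(template.replace(TOKEN, code));
        }
        return fields;
    }
}
```

For a document tagged with languages "xx", "pl", "en" and a supported set of "en" and "pl", expanding "title_{lang}" yields "title_pl", because "pl" is the first supported code in the document's list.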
