lucene-solr-user mailing list archives

From "Caruana, Matthew" <mcaru...@icij.org>
Subject Stored value for highlighting from different field?
Date Wed, 01 Mar 2017 17:03:03 GMT
We’re currently using copyField directives in our schema to copy the same text to several
fields that use different analysers. For example, if the original field in the document
payload sent to the update handler is called “tika_output”, it is copied to “text”,
“text_case_sensitive” and “text_accent_sensitive”.

To avoid inflating the size of the index, “tika_output” has indexed=false and stored=true,
while “text” and friends have indexed=true and stored=false.
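
For illustration, the relevant part of the schema looks roughly like the sketch below. The field
names and the indexed/stored settings are the ones described above; the fieldType names
(“string”, “text_general”, “text_cs”, “text_as”) are placeholders assumed for this example,
not our exact configuration.

```xml
<!-- Sketch only: the fieldType names are hypothetical placeholders. -->

<!-- Source field: stored once, never searched directly. -->
<field name="tika_output" type="string" indexed="false" stored="true"/>

<!-- Search fields: indexed with different analysers, not stored. -->
<field name="text" type="text_general" indexed="true" stored="false"/>
<field name="text_case_sensitive" type="text_cs" indexed="true" stored="false"/>
<field name="text_accent_sensitive" type="text_as" indexed="true" stored="false"/>

<!-- The same source text is copied into each search field. -->
<copyField source="tika_output" dest="text"/>
<copyField source="tika_output" dest="text_case_sensitive"/>
<copyField source="tika_output" dest="text_accent_sensitive"/>
```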

We’re using the unified highlighter. I’ve read the code in UnifiedHighlighter.java, which
clearly shows that the field to be highlighted must be stored. Therefore, searching on text_case_sensitive
doesn’t yield highlighted results. Storing the field value redundantly would mean tripling
my storage costs.

I see that other people have brought up this issue before:

https://issues.apache.org/jira/browse/SOLR-1105
https://issues.apache.org/jira/browse/SOLR-5276

Is there anything that can be done? If it comes down to subclassing the unified highlighter,
does anyone have any recommendations for doing this?