lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Highlighting, offsets -- external doc store
Date Tue, 29 Nov 2016 17:49:31 GMT
(1) Not that I have readily at hand. And to make it
worse, there's the UnifiedHighlighter coming out soon....

I don't think there's a good way for (2).

For (3), at least, yes. The reason is simple. For analyzed text,
the only thing in the index is what made it through the
analysis chain. So stopwords are missing. Stemming
has been done. You could even have put a phonetic filter
in there and have terms like ARDT KNTR, which would
be...er...not very useful to show the end user. So the original
text must be available.
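
To make that concrete, here is a rough sketch of the idea in plain
Python. This is NOT Solr or Lucene code: the stopword list, toy_stem()
and analyze() are made-up stand-ins for a real analysis chain. It only
shows that the analyzed terms differ from the original text, and that
any offsets recorded alongside them point back into the original
string, which you still need in order to display anything.

```python
import re

# Hypothetical stopword list, standing in for a real StopFilter.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


def toy_stem(token):
    # Crude suffix stripping, standing in for a real stemmer.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    # Returns (term, start_offset, end_offset) for each surviving token.
    # Offsets refer to positions in the ORIGINAL text, not in the terms.
    terms = []
    for match in re.finditer(r"\w+", text):
        token = match.group().lower()
        if token in STOPWORDS:
            continue  # stopwords never reach the "index"
        terms.append((toy_stem(token), match.start(), match.end()))
    return terms


original = "The hunters walked to the cabins"
indexed = analyze(original)

# What the "index" holds: stemmed terms, no stopwords.
print(" ".join(term for term, _, _ in indexed))

# Offsets are only useful together with the original text.
for term, start, end in indexed:
    print(term, "->", original[start:end])
```

The terms alone cannot be turned back into the sentence the user
wrote; something has to supply the original text for display.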




Not much help...
Erick

On Tue, Nov 29, 2016 at 8:43 AM, John Bickerstaff
<john@johnbickerstaff.com> wrote:
> All,
>
> One of the questions I've been asked to answer / prove out is around the
> question of highlighting query matches in responses.
>
> BTW - One assumption I'm making is that highlighting is basically a
> function of storing offsets for terms / tokens at index time.  If that's
> not right, I'd be grateful for pointers in the right direction.
>
> My underlying need is to get highlighting on search term matches for
> returned documents.  I need to choose between doing this in Solr and using
> an external document store, so I'm interested in whether Solr can provide
> the doc store with the information necessary to identify which section(s)
> of the doc to highlight in a query response...
>
> A few questions:
>
> 1. This page doesn't say a lot about how things work - is there somewhere
> with more information on dealing with offsets and highlighting? On offsets
> and how they're handled?
> https://cwiki.apache.org/confluence/display/solr/Highlighting
>
> 2. Can I return offset information with a query response or is that
> internal only?  If yes, can I return offset info if I have NOT stored the
> data in Solr but indexed only?
>
> (Explanation: Currently my project is considering indexing only and storing
> the entire text elsewhere -- using Solr to return only doc ID's for
> searches.  If Solr could also return offsets, these could be used in
> processing the text stored elsewhere to provide highlighting)
>
> 3. Do I assume correctly that in order for Solr highlighting to work
> correctly, the text MUST also be stored in Solr (I.E. not indexed only, but
> stored=true)
>
> Many thanks...
