lucene-java-user mailing list archives

From Trejkaz <>
Subject Re: Highlighting text, do I seriously have to reimplement this from scratch?
Date Wed, 05 Feb 2014 12:07:02 GMT
On Wed, Feb 5, 2014 at 4:16 AM, Earl Hood <> wrote:
> Our current solution is to do highlighting on the client-side.  When
> search happens, the search results from the server includes the parsed
> query terms so the client has an idea of which terms to highlight vs
> trying to reimplement a complete query string parser in the client.
> A problem is that Lucene (we are still on v3.0.3) does not provide a
> robust mechanism for extracting the terms of a query.  The following is
> the utility method that the server uses to get the terms needed to
> support client-side highlighting:
>   public static Set<Term> extractTermsFromQuery(
>       Query q,
>       IndexReader r,
>       Set<Term> terms
>   )
[ ... ]

This is very similar to what we're doing now, actually.

It does avoid the mess of having to double-parse the query, but the
catch is that we still have to double-parse the text (and the text is
nearly always larger).

All the special cases for fuzzy queries, regex queries, phrase queries
and the like, having to dig inside queries to pull out filters,
sometimes having to dig inside filters to pull out queries (had to
modify the Lucene API here and there to make more of it public, as I

I just thought it would be nice to be able to find all the matches,
pull just those bits of the text somehow and display them without
reading the rest of the text. At least, without reading the rest of
the text all the time. I think I would have to store something about
where the lines wrap in the database in order to really avoid reading
all the text. :/
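
A rough sketch of that last idea, to make it concrete. This is not
something we have running, and every name in it (LineIndex, lineStarts,
lineOf, rangeForMatch) is made up for illustration. It assumes lines
break at '\n' (real wrap points depend on the display, so treat that as
a stand-in), that match offsets are char offsets into the stored text,
and that the array of line-start offsets is computed once at index time
and stored next to the text:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: maps match offsets to the whole lines containing them,
// so that only that slice of the stored text needs to be read for display.
final class LineIndex {

    // Char offsets at which each line starts. Assumes '\n' line breaks.
    static int[] lineStarts(String text) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        for (int i = 0; i < text.length(); i++) {
            // A trailing newline does not open a new (empty) line.
            if (text.charAt(i) == '\n' && i + 1 < text.length()) {
                starts.add(i + 1);
            }
        }
        int[] out = new int[starts.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = starts.get(i);
        }
        return out;
    }

    // Index of the line containing the given offset (offset must be >= 0).
    static int lineOf(int[] starts, int offset) {
        int pos = Arrays.binarySearch(starts, offset);
        // Not found: binarySearch returns -(insertionPoint) - 1, and the
        // containing line is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    // Returns {from, to}: the char range of the whole lines that cover the
    // match [matchStart, matchEnd). Only this range has to be read.
    static int[] rangeForMatch(int[] starts, int textLength,
                               int matchStart, int matchEnd) {
        int firstLine = lineOf(starts, matchStart);
        int lastLine = lineOf(starts, Math.max(matchStart, matchEnd - 1));
        int from = starts[firstLine];
        int to = lastLine + 1 < starts.length ? starts[lastLine + 1] : textLength;
        return new int[] { from, to };
    }
}
```

With something like this, the match offsets plus the stored int[] would
be enough to work out which slice of the text to fetch, assuming the
store can actually read a sub-range without loading the whole thing.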

