lucene-java-user mailing list archives

From "Erick Erickson" <>
Subject Re: How to filter fields with hits from result set
Date Wed, 23 May 2007 18:59:35 GMT
As luck would have it, I've done something very similar. What I had
to do is index a special token at the end of each page. Then I could
get the term offsets for each page....

Then I used one of the SpanQuery.getSpans methods to get the
offsets of all the hits across all of the pages.

Now I have a list of the offsets of the *last* term on each
page and a list of the offsets of the hits. A hit belongs to the
first page whose end-of-page offset is at or beyond the hit's
offset, so from these two lists I can tell which pages have hits.
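
As a rough sketch of that last step only (not my actual code): assume you
already have the end-of-page marker positions in ascending order and the hit
positions as plain int arrays. The class name PageHits and the method
pagesWithHits are made up for illustration, and no Lucene calls are shown;
how you collect the two arrays depends on your Lucene version.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: maps hit positions to page numbers using the
// positions of the end-of-page marker tokens.
class PageHits {

    /**
     * @param pageEnds     position of the marker token that ends each page,
     *                     in ascending order (pageEnds[0] ends page 1)
     * @param hitPositions positions of the hits, in any order
     * @return 1-based numbers of the pages that contain at least one hit
     */
    static SortedSet<Integer> pagesWithHits(int[] pageEnds, int[] hitPositions) {
        SortedSet<Integer> pages = new TreeSet<>();
        for (int hit : hitPositions) {
            int idx = Arrays.binarySearch(pageEnds, hit);
            if (idx < 0) {
                // Not an exact match: convert to the insertion point, i.e. the
                // index of the first page end greater than the hit position.
                idx = -idx - 1;
            }
            // Hits past the last marker do not belong to any page; skip them.
            if (idx < pageEnds.length) {
                pages.add(idx + 1);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        int[] pageEnds = {100, 250, 400};   // three pages
        int[] hits = {5, 120, 130, 399};    // made-up hit positions
        System.out.println(pagesWithHits(pageEnds, hits)); // prints [1, 2, 3]
    }
}
```

The binary search keeps the lookup at O(log pages) per hit, which matters
little for a 10-page document but helps for long ones.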


On 5/23/07, Andreas Guther <> wrote:
> Hi,
> If a search returns a document that has multiple fields with the same
> name, is there a way to filter only those fields that contain hits?
> Background:
> I am indexing documents and we store all content in our index for
> display reasons.  We want to show only those pages containing hits.  My
> first implementation was saving each page in a Lucene document.  For
> performance reasons we are now looking into indexing the complete
> indexed document as a single Lucene document.
> Every page is added to a field in the Lucene document named
> page-content.  That means I am ending with as many fields named
> page-content as the document has pages.
> My search now returns a single Lucene document, in contrast to my
> first approach with one page per Lucene document.  My problem right now
> is: how can I limit the returned page-content fields to those field
> entries that contain hits?  If I have hits on five pages of a document
> with 10 pages, I would like to have only the pages with the hits, not
> all of them.
> Is there anything in Lucene that limits the returned fields to fields
> with hits only?
> Thanks in advance,
> Andreas