lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: How to filter fields with hits from result set
Date Wed, 23 May 2007 18:59:35 GMT
As luck would have it, I've done something very similar. What I had
to do is index a special token at the end of each page. Then I could
get the term offsets for each page....

Then I used one of the SpanQuery.getSpans to get all of the
offsets of the hits throughout all of the pages.

now I have a list of all the offsets of the *last* term on each
page and a list of the offsets of the hits. From these two
lists I can know which pages have hits.


Best
Erick

On 5/23/07, Andreas Guther <Andreas.Guther@markettools.com> wrote:
>
> Hi,
>
> If a search returns a document that has multiple fields with the same
> name, is there a way to filter only those fields that contain hits?
>
>
> Background:
>
> I am indexing documents and we store all content in our index for
> display reasons.  We want to show only those pages containing hits.  My
> first implementation was saving each page in a Lucene document.  For
> performance reasons why are now looking into indexing the complete
> indexed document as a single Lucene document.
>
> Every page is added to a field in the Lucene document named
> page-content.  That means I am ending with as many fields named
> page-content as the document has pages.
>
> My search now returns me a single Lucene document in contrary to my
> first approach with page per Lucene document.  My problem right now is:
> how can I limit the returned page-contents fields for pages to those
> field entries that contain hits.  If I have hits on pages five pages
> from a document with 10 pages I would like to have only the pages with
> the hits, not all.
>
> Is there anything in Lucene that limits the returned fields to fields
> with hits only?
>
> Thanks in advance,
>
> Andreas
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message