lucene-java-user mailing list archives

From "JMA" <>
Subject Best way to index document page by page?
Date Fri, 24 Jun 2005 07:28:58 GMT

I have a requirement to search documents page by page.  For example, in a
500-page document, if someone searches for "foo", I need to return "Found
foo on pages 4, 6, 24, 100, 223, 401, and 455".

The way I've implemented this is to index each *page* separately, so my
500-page document is actually treated as not one but 500 documents.  Then
when I get hits, I can play sort games to aggregate the results so they look
like the example above.
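To make the aggregation step concrete, here is a minimal sketch in plain Java of what those "sort games" might look like. It assumes each per-page index entry carries two stored values, called docId and page here (both names are hypothetical, as are the class and method names), and that the search hits have already been read into a list. It deliberately uses no Lucene calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper: groups page-level hits back into their source documents.
class PageHitAggregator {

    /** One hit from the page-level index (field names are assumptions). */
    static final class PageHit {
        final String docId; // identifies the original multi-page document
        final int page;     // 1-based page number within that document

        PageHit(String docId, int page) {
            this.docId = docId;
            this.page = page;
        }
    }

    /** Collects hits per document; TreeSet keeps pages sorted and unique. */
    static Map<String, TreeSet<Integer>> groupByDocument(List<PageHit> hits) {
        Map<String, TreeSet<Integer>> pagesByDoc = new TreeMap<>();
        for (PageHit hit : hits) {
            pagesByDoc.computeIfAbsent(hit.docId, k -> new TreeSet<>()).add(hit.page);
        }
        return pagesByDoc;
    }

    /** Formats one document's pages as a short "Found ... on pages ..." line. */
    static String describe(String term, TreeSet<Integer> pages) {
        List<String> parts = new ArrayList<>();
        for (int p : pages) {
            parts.add(String.valueOf(p));
        }
        String label = pages.size() == 1 ? "page" : "pages";
        return "Found " + term + " on " + label + " " + String.join(", ", parts);
    }
}
```

Hits can arrive in any order (for example, by relevance); the sorted set puts the pages back in reading order for each document.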

Is this the best way to do this?  Is there a way to store location
information associated with each term within a field?  Note that there can
be thousands of documents containing thousands of pages.
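To illustrate what "location information" per term could mean, here is a hedged sketch of one possibility, not a description of any existing Lucene feature. It assumes the whole document is indexed as a single field, that the token position of each matching term can be obtained, and that the token position at which each page begins was recorded at indexing time. The class PageLocator and the pageStarts array are hypothetical names. A binary search then maps a term position back to a page number.

```java
import java.util.Arrays;

// Hypothetical helper: maps a token position within a document to a page number.
class PageLocator {

    // pageStarts[i] is the token position of the first token on page i + 1.
    // Assumed to be strictly ascending (so no empty pages).
    private final int[] pageStarts;

    PageLocator(int[] pageStarts) {
        this.pageStarts = pageStarts.clone();
    }

    /** Returns the 1-based page containing the given token position. */
    int pageOf(int position) {
        int idx = Arrays.binarySearch(pageStarts, position);
        // Exact match: the position is the first token of page idx + 1.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the
        // containing page starts at index insertionPoint - 1.
        int pageIndex = idx >= 0 ? idx : -idx - 2;
        if (pageIndex < 0) {
            throw new IllegalArgumentException("position precedes first page: " + position);
        }
        return pageIndex + 1;
    }
}
```

Under these assumptions, one index entry per document would suffice, at the cost of storing the page-start table alongside it and looking up positions for every hit.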

Thanks in advance,
