lucene-java-user mailing list archives

From Erik Hatcher
Subject Re: repeating fields
Date Wed, 07 Dec 2005 14:55:27 GMT

On Dec 7, 2005, at 8:48 AM, Reza Ghaffaripour wrote:
> I think having different documents will not be a good idea.
> For me, each XML file is an ebook, and "p" means paragraph.
> I have hundreds of paragraphs in every ebook, and I think I should
> keep each ebook in a single document. Am I right?

How you design your index requires consideration of all you're trying  
to do with it.  It's an art form, in fact.  So while we can offer  
some ideas, ultimately you have to find what fits.   The granularity  
of what you index as a Document is the granularity of what you get  
back from searches as Hits.

There are blended approaches - an index does not have to be  
homogeneous in Document design.  You could have documents that  
represent the entire e-book, and documents that represent each  
paragraph.  You can put a "type" field on each document to
distinguish them and filter on it in a search.
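
To make the blended idea concrete, here is a rough sketch in plain Java.
It is deliberately NOT the Lucene API: documents are modeled as simple
maps and matching is a naive substring test. The field names ("type",
"bookId", "text") and the values "book" and "paragraph" are assumptions
made for illustration only. In Lucene itself the "type" field would
typically be indexed untokenized and the filter expressed as a required
term clause, but the shape of the design is the same.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java model of a blended index; this is not the Lucene API.
class BlendedIndexSketch {

    // A "document" is a bag of named fields. All field names are hypothetical.
    static Map<String, String> doc(String type, String bookId, String text) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("type", type);      // "book" or "paragraph"
        d.put("bookId", bookId);  // links a paragraph back to its e-book
        d.put("text", text);
        return d;
    }

    // Index one e-book at two granularities: one whole-book document,
    // plus one document per paragraph.
    static List<Map<String, String>> indexBook(String bookId, List<String> paragraphs) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("book", bookId, String.join(" ", paragraphs)));
        for (String p : paragraphs) {
            docs.add(doc("paragraph", bookId, p));
        }
        return docs;
    }

    // Return documents of the requested type whose text contains the word
    // (case-insensitive substring match, standing in for a real query).
    static List<Map<String, String>> search(List<Map<String, String>> index,
                                            String type, String word) {
        String needle = word.toLowerCase(Locale.ROOT);
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> d : index) {
            boolean typeMatches = d.get("type").equals(type);
            boolean textMatches = d.get("text").toLowerCase(Locale.ROOT).contains(needle);
            if (typeMatches && textMatches) {
                hits.add(d);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = new ArrayList<>();
        index.addAll(indexBook("book-1",
                Arrays.asList("The ship left port.", "A storm hit at night.")));
        index.addAll(indexBook("book-2",
                Arrays.asList("The ship returned.", "The crew rested.")));

        // Same word, two granularities: hits are whole books or single paragraphs.
        for (Map<String, String> hit : search(index, "book", "storm")) {
            System.out.println("book hit: " + hit.get("bookId"));
        }
        for (Map<String, String> hit : search(index, "paragraph", "storm")) {
            System.out.println("paragraph hit: " + hit.get("text"));
        }
    }
}
```

The trade-off is that the text is indexed twice, once per granularity,
in exchange for being able to choose at search time whether a hit is a
whole e-book or a single paragraph.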

