lucene-java-user mailing list archives

From Phillip Rhodes
Subject What *is* a lucene document?
Date Sun, 05 Jun 2005 05:11:57 GMT
I understand that "Documents are the primary retrievable units from a 
Lucene query", but I don't know whether I want to have 12 documents in 
the Lucene index that represent the same business object, or whether I 
should place 12 different business documents within the Lucene index.

Here is the background:
I want to index a product catalog (some of the data is in a database and 
some is on the filesystem; I have a cross-reference between the two).
Each product is associated with attributes, categories, and one or more 
PDF/MS Word documents, HTML descriptions, images, etc.
A product could have 12 different files associated with it.

Is it okay if I create as many documents as there are assets that I want 
to return from a search, and add information to each document tying it 
back to the product that it is associated with?  Is that the right 
approach?
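
To make the question concrete, here is a rough sketch of the 
one-document-per-asset idea. It is plain Java and deliberately does not 
use the Lucene API: each "document" is just a map of field names to 
values, and the search is a naive substring match standing in for a real 
query. The class name, the field names (productId, productName, 
assetPath, assetType, contents), and the sample data are all made up for 
illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class AssetDocuments {

    // Hypothetical stand-in for an index document: one per asset (file),
    // carrying the id of the product it belongs to.
    static Map<String, String> assetDocument(String productId, String productName,
                                             String assetPath, String assetType,
                                             String contents) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("productId", productId);     // ties the asset back to its product
        doc.put("productName", productName); // duplicated on every asset of the product
        doc.put("assetPath", assetPath);
        doc.put("assetType", assetType);
        doc.put("contents", contents);       // extracted text of the file
        return doc;
    }

    // Illustrative data only: two products, four assets in total.
    static List<Map<String, String>> sampleIndex() {
        List<Map<String, String>> index = new ArrayList<>();
        index.add(assetDocument("P-100", "Cordless Drill", "docs/p100-manual.pdf",
                "pdf", "Installation guide and warranty terms"));
        index.add(assetDocument("P-100", "Cordless Drill", "docs/p100-warranty.doc",
                "msword", "Warranty certificate"));
        index.add(assetDocument("P-100", "Cordless Drill", "html/p100.html",
                "html", "Compact cordless drill"));
        index.add(assetDocument("P-200", "Bench Saw", "docs/p200-spec.pdf",
                "pdf", "Specification sheet with warranty period"));
        return index;
    }

    // Naive case-insensitive substring match; a real index would run a query here.
    static List<Map<String, String>> search(List<Map<String, String>> index, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : index) {
            if (doc.get("contents").toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Collapse asset-level hits into one entry per product, keeping hit order.
    static Map<String, List<String>> groupByProduct(List<Map<String, String>> hits) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Map<String, String> hit : hits) {
            grouped.computeIfAbsent(hit.get("productId"), k -> new ArrayList<>())
                   .add(hit.get("assetPath"));
        }
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, List<String>> grouped =
                groupByProduct(search(sampleIndex(), "warranty"));
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With this layout each hit is an individual asset, and the productId 
field is what lets the application group hits per product or look the 
product up in the database afterwards.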

Thanks, it's keeping me up at night.

BTW, I am working on a release of a professional-grade, open-source 
(Apache License) ecommerce suite, and I wouldn't mind help on the 
Lucene/search stuff.  There's plenty more for me to do: 120+ tables, and 
it is going to prod for a client this weekend (without search ;)  Contact me!
