lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Markus Fischer <>
Subject Different fields in the same and index and query boosting
Date Sun, 26 Feb 2006 10:20:44 GMT
Hash: SHA1


1) Many different fields because of different projects?

I'm accessing Lucenes index on a remote server through XML-RPC and I'ld
like to use one index for completely independent search projects. The
number of query requests is not that high (just a few per minute) so
from performance wise I wouldn't see a problem.

When adding documents to an index in this case I would add e.g. 5000
documents with fields like "name", "author", "excerpt" and then also a
batch of 2000 files with other fields like "title", "front", "back".

Before adding a batch of thousand files for one project I would first
remove them. Common for all projects would be one "key" field by which I
would differentiate the projects, the query terms are so crafted that
key is always set as an AND field. They key has values like "project1"
and "project2".

Now my concern is: the more projects I add the more different fields
would come into play. I would not recreate the index from scratch as I'm
doing right now but I would only remove all documents with e.g. key
"project1" and add the new documents completely but not touching other

- From what I read from the documentation, as long as I would call
optimized() after every batch of change it should be ok.

2) Query boosting
Currently I was using query boosting extensive for the headings in HTML
documents, e.g. title:(term)^8 h1:(term)^7 ... h6:(term)^2
content:(term)^1 . I was wondering if this is actually necessary. The
number of existing h1 to h6 fields with content decreases with the
amount of documents. To give the fields title and h1, which are the most
used ones anyway, the highest importance, to I need the boost factor
here anyway or can I avoid them?

thanks for any answers,
- - Markus
Version: GnuPG v1.4.2.1 (MingW32)
Comment: Using GnuPG with Mozilla -


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message