lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mohammad Norouzi" <>
Subject Lucene indexes and relationship
Date Sat, 15 Sep 2007 05:15:28 GMT
In our application, we have many categories (indexes) in which different
kind of information have been indexed. we provided a facility for our users
to opt their category to search and we also provided a way that they select
more than one category to search, afterwards, we must return back the result
from selected categories in which they are related to each other and discard
other documents which their IDs are not common with the results from other
categories, in other words, an intersection of all results

currently, each category has an ID field and this help us to find which
document to select and which one to discard, in order to implement this, we
defined a HitCollector in which, we collect all the Lucene's document IDs to
load the document later as well as all category ID to find its relationship
with other result. we store these IDs in a java.util.BitSet  structure to
save the memory and also for performance issues.
now this works fine, but it's messy, further more, to find all common
documents we have to compare all the results two by two and this is really
exhaustive and eat up the CPU resources and this deprived us from the
Lucene's speed of searching.

In my opinion, if this simple relationship could be embedded inside the
Lucene Api at low level sections of loading documents, this process would be
done very fast, and I think this feature will make many Lucene's user happy

I want to know, whether this is possible for Lucene developers or not, or it
is in their TO DO list or they are going to provide it in future.

thank you all Lucene developers and producers.

Mohammad Norouzi
see my blog:
another in Persian:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message