lucene-java-user mailing list archives

From "Ching-Pei Hsing" <cphs...@comergent.com>
Subject How to do refined search based on attributes and never return zero results
Date Wed, 07 Dec 2005 23:51:33 GMT
Has anyone solved the following problem, or does anyone have good
suggestions?

 

Each document is assigned to one or more category nodes in a hierarchy.

For example,

 

Document1: /Computer/Desktop

Document2: /Computer/Notebook; /Salesforce/ExtremePortable

Document3: /Computer/Server

......

 

For each search operation, we present not only the list of matching
documents but also the list of categories containing those documents,
together with the document count for each category:

 

/Computer/Desktop(30)

/Computer/Notebook(12)

/Computer/Accessories(51)

 

One can see that this is really useful, because it can "guide" the user
while refining the search criteria and quickly reduce the size of the
result set. I know we can do this by brute force: go through the entire
result set, retrieve the category field of each document, and aggregate
and count. That does not scale, though, if the number of documents to go
through is high. It can also create performance issues under load if
each execution thread holds on to the index reader for too long (because
of the number of documents it has to go through).
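
To make the brute-force approach concrete, here is a minimal sketch in
plain Java. It does not use any Lucene API: CategoryCounter and
countCategories are hypothetical names, and the input is assumed to be
the category paths already read from the category field of each hit,
with each document listing a given category at most once. The cost
grows with the size of the result set, which is exactly the concern
described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: brute-force category counting.
final class CategoryCounter {

    // Each inner list holds the category paths stored for one matching document.
    // Assumption: a document lists a given category at most once.
    static Map<String, Integer> countCategories(List<List<String>> hitCategories) {
        Map<String, Integer> counts = new TreeMap<>(); // keys sorted by category path
        for (List<String> categories : hitCategories) {
            for (String category : categories) {
                // Add 1 to the running count, starting from 1 for a new category.
                counts.merge(category, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Illustrative hit list only; real data would come from the index.
        List<List<String>> hits = Arrays.asList(
            Arrays.asList("/Computer/Desktop"),
            Arrays.asList("/Computer/Notebook", "/Salesforce/ExtremePortable"),
            Arrays.asList("/Computer/Server"),
            Arrays.asList("/Computer/Desktop"));

        // Print each category with its count in the "/path(count)" form shown earlier.
        for (Map.Entry<String, Integer> entry : countCategories(hits).entrySet()) {
            System.out.println(entry.getKey() + "(" + entry.getValue() + ")");
        }
    }
}
```

Every hit is visited once and every category on it is touched once, so
the work is proportional to the total number of category assignments
in the result set.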

 

Is there any API or approach we can leverage at search time? Is there
anything we can do at indexing time? Or is there any technology we need
to integrate, like those used for data warehousing? Any comments or
pointers will be greatly appreciated.

 

Thanks

 

Ching-pei