lucene-general mailing list archives

From vermansi <>
Subject Cluster Retrieval in Lucene
Date Thu, 25 Nov 2010 17:49:19 GMT

I wish to implement a cluster-based retrieval model in Lucene. I haven't gone
through the code fully and am unaware of any existing implementations of it
based on Lucene.
Could someone give me a heads-up on where to begin? There is too much
code to go through and I have very little time.

My idea is as follows.

The Lucene index should be organized into clusters, i.e. at indexing time
each document (D) is assigned to a cluster.
When a query (Q) is submitted, each cluster is searched for relevant documents,
and the documents from that cluster as well as from other clusters are ranked.
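
A minimal sketch of this idea in plain Java, deliberately not using the Lucene API. The class ClusteredIndexSketch, its method names, and the whitespace/punctuation tokenizer are all hypothetical, and "how well a cluster matches" is assumed here to be a simple count of query-term occurrences in the cluster. It only illustrates assigning each document to a cluster at indexing time and looking at clusters at query time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical in-memory stand-in for a clustered index; not a Lucene class.
class ClusteredIndexSketch {
    // cluster id -> (term -> occurrences across all documents in that cluster)
    private final Map<String, Map<String, Integer>> clusterTermCounts = new HashMap<>();
    // document id -> cluster id, fixed at indexing time
    private final Map<String, String> docToCluster = new HashMap<>();

    // Assumed tokenizer: lower-case, split on non-word characters.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Indexing time: each document D is assigned to exactly one cluster.
    void addDocument(String docId, String clusterId, String text) {
        docToCluster.put(docId, clusterId);
        Map<String, Integer> counts =
                clusterTermCounts.computeIfAbsent(clusterId, k -> new HashMap<>());
        for (String term : tokenize(text)) {
            counts.merge(term, 1, Integer::sum);
        }
    }

    String clusterOf(String docId) {
        return docToCluster.get(docId);
    }

    // Query time: total occurrences of the query terms inside one cluster.
    int clusterMatchCount(String clusterId, String query) {
        Map<String, Integer> counts =
                clusterTermCounts.getOrDefault(clusterId, Collections.emptyMap());
        int total = 0;
        for (String term : tokenize(query)) {
            total += counts.getOrDefault(term, 0);
        }
        return total;
    }

    // Clusters ordered by descending match count, ties broken by cluster id.
    List<String> rankClusters(String query) {
        List<String> ids = new ArrayList<>(clusterTermCounts.keySet());
        ids.sort((a, b) -> {
            int cmp = Integer.compare(clusterMatchCount(b, query), clusterMatchCount(a, query));
            return cmp != 0 ? cmp : a.compareTo(b);
        });
        return ids;
    }
}
```

In a real Lucene index the cluster assignment would more likely live in a stored field on each document rather than in a separate map; the map is used here only to keep the sketch self-contained.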

A brute-force way of implementing it could be:
1. Clusters are denoted by a field named cluster (C).
2. The query words are searched in the cluster field.
3. The scoring functions are changed to incorporate the math used in
cluster-based retrieval.
4. Documents in each cluster are ranked separately and then merged.
The problem with this approach is that many queries will have to be created
and the result processing will increase considerably.
If there are other ways of doing it, please let me know.

