Hi Madhu,
> 1. What search algorithm(s) [VSM, ...] are used or available in
> Lucene?
Lucene uses the Vector Space Model (VSM) as its retrieval model.
> 2. How is term weight calculated in Lucene, how many types of term
> weighting formulas are implemented, and what are they?
TF-IDF weighting is used, with some modifications to how the raw score is
computed. You can refer to "Lucene in Action" by Otis Gospodnetic and
Erik Hatcher, page 78. Here is the formula:
score(q, d) = SUM over t in q of
    tf(t in d) * idf(t) * boost(t.field in d) * lengthNorm(t.field in d)
There are variations in how you can compute this score, for example
by setting different boost levels while indexing or searching. But the
basic scoring is still based on TF-IDF weighting.
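To make the summation concrete, here is a rough sketch in plain Java. It
does not use the Lucene API. The class TfIdfSketch and all of its method
and parameter names are made up for illustration, and the tf, idf and
lengthNorm definitions are assumptions modelled on the default behaviour
as I understand it (sqrt of the frequency, 1 + ln(numDocs / (docFreq + 1)),
and 1 / sqrt(field length)), so treat them as placeholders rather than as
the library's exact code.

```java
import java.util.List;
import java.util.Map;

/** Toy illustration of the summation above; not Lucene's actual code. */
class TfIdfSketch {

    // Assumed tf: square root of the raw term frequency in the field.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Assumed idf: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Assumed lengthNorm: 1 / sqrt(number of terms in the field).
    static double lengthNorm(int numTermsInField) {
        return 1.0 / Math.sqrt(numTermsInField);
    }

    // Sums tf * idf * boost * lengthNorm over the query terms that
    // occur in the document field; absent terms contribute nothing.
    static double score(List<String> queryTerms,
                        Map<String, Integer> termFreqsInDoc,
                        Map<String, Integer> docFreqs,
                        int numDocs,
                        int fieldLength,
                        double fieldBoost) {
        double sum = 0.0;
        for (String t : queryTerms) {
            int freq = termFreqsInDoc.getOrDefault(t, 0);
            if (freq == 0) {
                continue;
            }
            sum += tf(freq)
                 * idf(docFreqs.getOrDefault(t, 0), numDocs)
                 * fieldBoost
                 * lengthNorm(fieldLength);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical statistics for one document field and a tiny index.
        Map<String, Integer> termFreqs = Map.of("lucene", 4, "search", 1);
        Map<String, Integer> docFreqs = Map.of("lucene", 1, "search", 3);
        double s = score(List.of("lucene", "search"),
                         termFreqs, docFreqs, 10, 16, 1.0);
        System.out.println("score = " + s);
    }
}
```

Raising fieldBoost above 1.0 scales every term's contribution for that
field, which is the kind of variation I mean when I mention boosting at
indexing or search time.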
Hope it helps.
Rajesh Munavalli
-----Original Message-----
From: Madhu Panitini [mailto:Madhu.Panitini@passconsulting.com]
Sent: Monday, August 08, 2005 12:02 PM
To: general@lucene.apache.org
Subject: search alogorithm in Lucene
Hi all,
I am new to Lucene, but I am familiar with IR. I want to build an IR
system in Java and I found Lucene, but some questions remained
unanswered after searching the complete website.
I have a couple of questions regarding Lucene:
1. What search algorithm(s) [VSM, ...] are used or available in
Lucene?
2. How is term weight calculated in Lucene, how many types of term
weighting formulas are implemented, and what are they?
Regards
Madhu
