lucene-general mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject RE: search alogorithm in Lucene
Date Mon, 08 Aug 2005 18:51:19 GMT
If you need to index XML with Lucene, you can look at my article about
using Digester+Lucene to parse+index XML documents.  The article can be
found on the IBM developerWorks site.
You can also look at the code that comes with Lucene in Action, where we
show how to parse with Digester and the SAX 2.0 API, and index with Lucene.
Chapter 7, I believe.
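
To make the parse-then-index idea concrete, here is a minimal sketch of the
SAX half using only the JDK. It is not the code from the article or the book:
the class name XmlFieldExtractor and the rule "one field per element name" are
assumptions made for illustration. Each name/value pair it returns is what you
would then add as a field to a Lucene Document before handing it to an
IndexWriter.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (an assumption, not from the article or the book):
// collects the text of each leaf element into a name -> text map.
class XmlFieldExtractor extends DefaultHandler {
    private final Map<String, String> fields = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0); // start collecting text for the element just opened
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            // Repeated elements with the same name are joined with a space.
            fields.merge(qName, value, (a, b) -> a + " " + b);
        }
        text.setLength(0);
    }

    // Parses an XML string and returns the extracted fields in document order.
    static Map<String, String> extract(String xml) throws Exception {
        XmlFieldExtractor handler = new XmlFieldExtractor();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book><title>Lucene in Action</title><author>Otis</author></book>";
        // Each entry would become one field of a Lucene Document.
        extract(xml).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Keeping the extraction separate from the indexing step means the same handler
can be tested without an index, and the choice of which fields are stored or
tokenized stays in the Lucene-specific code.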

Otis


--- Rajesh Munavalli <rajeshm@dessci.com> wrote:

> Lucene handles plain text only. If you use the standard analyzer,
> all the content in a document is tokenized the same way. To index an
> XML document, you need to write your own Analyzer/Tokenizer that
> separates the XML tags from the text and indexes them accordingly.
> I guess you want to preserve the metadata contained in the XML
> document.
> 
> --
> Rajesh Munavalli 
> 
> -----Original Message-----
> From: Madhu Panitini [mailto:Madhu.Panitini@pass-consulting.com] 
> Sent: Monday, August 08, 2005 12:17 PM
> To: general@lucene.apache.org
> Subject: RE: search alogorithm in Lucene
> 
> Hi, one more question.
> 
> Is there any text file format that Lucene expects, something like
> XML tags added to the text document before it is given to Lucene
> for indexing?
> 
> regards
> madhu
> 
> -----Original Message-----
> From: Madhu Panitini
> Sent: Monday, August 08, 2005 7:02 PM
> To: general@lucene.apache.org
> Subject: search alogorithm in Lucene
> 
> Hi all,
> I am new to Lucene, but I am familiar with IR. I want to build an IR
> system in Java and I found Lucene, but some questions remained
> unanswered for me after searching the complete website.
> 
> I have a couple of questions regarding Lucene:
> 
> 1. What search algorithm(s) [VSM, ...] are used or available in
> Lucene?
> 
> 2. How is term weight calculated in Lucene, how many term weighting
> formulas are implemented, and what are they?
> 
> Regards
> Madhu
> 

