lucene-dev mailing list archives

From Leo Galambos <Le...@seznam.cz>
Subject Re: Detailed information about searching,indexing technique
Date Tue, 08 Jul 2003 16:26:07 GMT
Ricardo Baeza-Yates, Berthier Ribeiro-Neto: Modern Information 
Retrieval, ACM Press, ISBN 0-201-39829-X

The searching phase is trivial if you know the basic vector model.
The indexing phase is described on pp. 196-199; it is a classic algorithm.
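
For readers without the book at hand, here is a rough sketch of what "basic vector model" and "classic indexing algorithm" usually refer to: an inverted index (term to postings) plus ranking by cosine similarity of tf-idf vectors. It is not Lucene's code and not necessarily the exact algorithm on those pages. The names tokenize, build_index and search, the whitespace tokenizer, and the log(1 + N/df) weighting are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for a real analyzer.
    return text.lower().split()

def build_index(docs):
    """Inverted index: term -> {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index

def search(index, n_docs, query):
    """Return [(doc_id, score)] sorted by cosine similarity, best first."""
    def weight(term):
        # Hypothetical idf variant; always positive, so norms never vanish.
        return math.log(1 + n_docs / len(index[term]))

    # Squared length of every document vector in tf-idf space.
    norms = defaultdict(float)
    for term, postings in index.items():
        w = weight(term)
        for doc_id, tf in postings.items():
            norms[doc_id] += (tf * w) ** 2

    # Dot product of query and document vectors, one postings list at a time.
    dots = defaultdict(float)
    for term, qtf in Counter(tokenize(query)).items():
        if term in index:
            w = weight(term)
            for doc_id, tf in index[term].items():
                dots[doc_id] += (qtf * w) * (tf * w)

    # The query norm is the same for every document, so it is left out.
    scored = [(d, dot / math.sqrt(norms[d])) for d, dot in dots.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Only the postings lists of the query terms are touched at search time, which is why the search side is simple once the index exists.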

Your queries:
1 - see the archive.
2 - you cannot solve it, AFAIK. BTW, you would do better to work with
entropy than with frequencies.
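
To make the two queries and the entropy remark concrete, here is a small sketch that works on plain strings held in memory, outside Lucene. It is one possible reading of the suggestion, not a Lucene feature. The function names, the whitespace tokenizer, and the document-level definition of "associated" are assumptions made for this example.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for a real analyzer.
    return text.lower().split()

def top_terms(text, k=3):
    """Query 1: the k most frequent terms of a single document."""
    return Counter(tokenize(text)).most_common(k)

def cooccurring_terms(term, docs, k=3):
    """Query 2, naive version: terms that share the most documents with `term`."""
    counts = Counter()
    for text in docs:
        words = set(tokenize(text))
        if term in words:
            counts.update(words - {term})
    return counts.most_common(k)

def term_entropy(term, docs):
    """Shannon entropy (in bits) of how a term's occurrences spread over documents.

    Low entropy: the term is concentrated in few documents (more specific).
    High entropy: the term is spread evenly (less informative).
    """
    counts = [tokenize(text).count(term) for text in docs]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c)
```

Raw co-occurrence counts tend to be dominated by words that appear everywhere; an entropy value per term gives one way to discount those candidates before ranking.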

-g-

clibois@student.fsa.ucl.ac.be wrote:

>Hello. I work at a young company called Denali
>that is interested in using Lucene. I have looked
>on the official website for information about it,
>but I didn't find any detailed explanation of how
>the index is created and how a search is run
>against it.
>In fact, we would like to add two special queries:
>- one that finds the most frequent terms
>in a document;
>- one that finds the terms most frequently
>associated with another term (for example, for the
>given term "lucene", we would find "search", "moteur",
>"open source", ...).
>If somebody could point me to detailed information,
>not on "how to use Lucene" but on "how does it work
>in detail?" (algorithms used, ...), it would be
>nice.
>Best regards
>Claude Libois
> 
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-dev-help@jakarta.apache.org





