lucene-dev mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Summer of Code idea for lucene
Date Sat, 13 Sep 2008 14:48:19 GMT
Hey Joaquin,

Your work here looks very interesting. The Lucene community has shown a 
strong interest in this area before (see LUCENE-965).

I see you went with an LGPL license though. This might be a bit of a 
barrier to getting feedback from a community based on Apache-licensed 
software. Obviously, there still might be interest, learning, and an 
exchange of ideas, but none of your code can be distributed with Lucene, 
and so what you have done loses some of its appeal in that sense. Is 
there any chance you would be willing to relax the license, possibly 
gaining more feedback, contributors, and possible inclusion in Lucene? 
Certainly not necessary to receive feedback, but I think it would help 
-- I'd certainly be looking closer anyway.

- Mark

Joaquin Perez Iglesias wrote:
> Hi all,
>
> I finally found some time to finish the BM25/BM25F implementation for 
> Lucene (more details at 
> http://nlp.uned.es/~jperezi/Lucene-BM25/). It has been tested, but I 
> cannot guarantee that it is bug-free.
> It would be great to receive some feedback about it.
>
> There are some details about the implementation that I think will 
> be of interest, such as how to calculate the average_length or idf at 
> document level.
> If you find any bug or mistake in the supplied implementation, please 
> let me know and I will try to fix it; the same goes for questions.
>
> I hope that some of you will find it useful.
>
> Thanks in advance.
>
>
>
> joaquin.perez@lsi.uned.es wrote:
>> Hi Otis,
>>
>> as my colleague said, we have a first implementation of BM25 on top of 
>> Lucene. This development is part of an (almost finished) thesis 
>> project that compares different IR models over a standard 
>> collection. At the same time, we are trying to extend this first 
>> implementation to support BM25F for multifield queries. 
>> Unfortunately, we are currently too busy to prepare a final version 
>> of this code, so we will have to finish it over the summer 
>> (hopefully we will have more time :-))) and make it public then.
>>
>> We will inform this list when we have finished preparing a 
>> final version.
>>
>> Thanks to everybody for the interest!!!
>>
>> Bye
>> Joaquin
>>
>> -----------------------------------------------------------
>> Joaquín Pérez Iglesias
>> Dpto. Lenguajes y Sistemas Informáticos
>> E.T.S.I. Informática (UNED)
>> Ciudad Universitaria
>> C/ Juan del Rosal nº 16
>> 28040 Madrid - Spain
>> Phone. +34 91 398 87 25
>> Fax    +34 91 398 65 35
>> Office  2.07
>> Email: joaquin.perez@lsi.uned.es
>> -----------------------------------------------------------
>>
>> Otis Gospodnetic <otis_gospodnetic@yahoo.com> writes:
>>
>>  
>>> Hi Jose,
>>>
>>> I was wondering if you ever got to this.  I would love to see and 
>>> try BM25 for Lucene!
>>>
>>>
>>> I'm looking at http://code.google.com/soc/2008/asf/about.html
>>> and it looks like this didn't make it into GSoC, but this would 
>>> still be great to have.
>>>
>>> Thanks,
>>> Otis
>>> -- 
>>> Sematext -- http://sematext.com/ --
>>> Lucene - Solr - Nutch
>>>
>>>
>>> ----- Original Message ----
>>>> From: José Ramón Pérez Agüera <jose.aguera@gmail.com>
>>>> To: java-dev@lucene.apache.org;
>>>>     Joaquin Perez-Iglesias <joaquin.perez.iglesias@gmail.com>
>>>> Sent: Saturday, March 15, 2008 4:54:08 AM
>>>> Subject: Re: Summer of Code idea for lucene
>>>>
>>>> we have almost implemented BM25 using the Lucene structure, but we need
>>>> help to finish the query parser and other details. If you or somebody
>>>> else wants, we can send you the code and you can help us implement the
>>>> query parser and prepare the code for the sandbox.
>>>>
>>>> If there are people interested, I can make a web page for the project
>>>> and put our implementation up for download.
>>>>
>>>> Is anybody interested?
>>>>
>>>> jose
>>>>
>>>> -- 
>>>> José Ramón Pérez Agüera
>>>>
>>>> Dept. de Ingeniería del Software e Inteligencia Artificial
>>>> Despacho 411 tlf. 913947599
>>>> Facultad de Informática
>>>> Universidad Complutense de Madrid
>>>>
>>>> On Sat, Mar 15, 2008 at 5:32 AM, Ian Holsman wrote:
>>>>      
>>>>> If no one objects (I don't think it's too late)
>>>>>
>>>>>  would you mind a GSOC project to implement BM25 relevancy/scoring?
>>>     
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>     
>>
>>
>>
>>   
>



