lucene-java-user mailing list archives

From Marcel Bruch <br...@cs.tu-darmstadt.de>
Subject Re: Using Lucene with a rather simplistic scoring system?
Date Tue, 06 Jul 2010 07:43:32 GMT
Hi all,

to close my thread:

My requirement was to build a simple scoring system that basically 
reuses Lucene's index infrastructure but not its advanced scoring 
system, i.e., I had to replace the query and scorer infrastructure with 
my own implementation.

In detail I had to come up with my own versions of:

    * ExamplesSearchQuery extends Query
    * ExamplesSearchWeight extends Weight
    * ExamplesScorer extends Scorer

The usage of this API looks similar to standard Lucene usage:

ExamplesSearchQuery query = LuceneQueryUtil.toCodeSearchQuery(request);
IndexSearcher searcher = new IndexSearcher(luceneIndexReader);
TopDocs search = searcher.search(query, 15);
...

To implement my scoring function, the scorer delegates to several 
sub-scorers, similar to BooleanScorer, and sums up the sub-scores to 
build the final score.
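
As an illustration of the summing step only, here is a minimal sketch in plain Java. It is not the actual ExamplesScorer and does not use the Lucene API: the names SubScorer, WeightedSumScorer and WeightedSumDemo, as well as the weights and feature values, are hypothetical. Each sub-scorer stands in for one feature function f_i(d,q), assumed to be already bound to the query, and the final score is the weighted sum of the sub-scores.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one feature function f_i(d,q); not a Lucene type. */
interface SubScorer {
    double score(int docId);
}

/** Computes score(d,q) = sum over i of w_i * f_i(d,q) by delegating to sub-scorers. */
final class WeightedSumScorer {
    private final List<Double> weights = new ArrayList<>();
    private final List<SubScorer> subScorers = new ArrayList<>();

    /** Registers a sub-scorer together with its weight w_i. */
    WeightedSumScorer add(double weight, SubScorer subScorer) {
        weights.add(weight);
        subScorers.add(subScorer);
        return this;
    }

    /** Sums the weighted sub-scores for the given document. */
    double score(int docId) {
        double sum = 0.0;
        for (int i = 0; i < subScorers.size(); i++) {
            sum += weights.get(i) * subScorers.get(i).score(docId);
        }
        return sum;
    }
}

class WeightedSumDemo {
    public static void main(String[] args) {
        // Made-up per-document feature values, indexed by document id.
        double[] callOverlap = {0.5, 1.0};
        double[] typeOverlap = {1.0, 0.0};

        WeightedSumScorer scorer = new WeightedSumScorer()
                .add(2.0, doc -> callOverlap[doc])
                .add(0.5, doc -> typeOverlap[doc]);

        for (int doc = 0; doc < callOverlap.length; doc++) {
            System.out.println("doc " + doc + " -> " + scorer.score(doc));
        }
    }
}
```

In a real Lucene integration the sub-scorers would also have to iterate over matching documents, which this sketch leaves out.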

Since yesterday, the first prototype based on Lucene has been available 
for download, and a blog post showing the Eclipse integration in action 
is available here:
http://code-recommenders.blogspot.com/2010/07/why-is-google-codesearch-not-google-for.html

I would be glad to hear your comments on the post :-)

Thanks again for your help on Lucene,
Marcel


On 11.06.2010 18:12, Marcel Bruch wrote:
> [...]
>> Sounds like an interesting project.
> I think it is :-) It's tightly integrated into Eclipse where (i) it 
> grabs the code from your current editor on demand, (ii) automatically 
> creates and submits a query from it, and (iii) displays the best code 
> examples directly beside your editor in a separate view. Once scoring 
> works with Lucene, you should be able to test the first version in a 
> week or two.
>
> However, can I implement this scoring function with Lucene:
> score(d,q) = \sum_{i \in I} w_i * f_i(d,q)

