lucene-dev mailing list archives

From Vikas Gupta
Subject Exact search algorithm used in lucene
Date Thu, 09 Dec 2004 10:17:09 GMT
Hi developers,

    I am a new user/developer of Lucene. I read Doug Cutting's paper
"Space Optimizations for Total Ranking", which describes a number of
algorithms for searching an index.

    I was curious which one(s) Lucene implements.

    Does it have something like "parallel merge" (Figure 4 in the paper)?
I think it wouldn't use the simple inverted index search (Figure 2 in the
paper), because that is costlier (it takes O(N) space, where N is the number
of documents in the collection). Is that right?
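
To make the contrast in the question concrete, here is a minimal Java sketch of the two strategies as they are described above. It is an illustration under assumptions, not code from the paper's figures or from Lucene: every class and method name is hypothetical, posting lists are plain sorted arrays, and a document's score is simply the sum of its per-term weights. The first method keeps one accumulator per document in the collection (the O(N) space mentioned above); the second merges the posting lists in document-id order and only ever holds one cursor per term plus the k best hits.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustration only: every name here is hypothetical, not from the paper or Lucene.
class RankingSketch {

    /** Posting list of one term: ascending doc ids, each with a weight. */
    static final class Postings {
        final int[] docs;
        final float[] weights;
        Postings(int[] docs, float[] weights) { this.docs = docs; this.weights = weights; }
    }

    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Read position inside one posting list. */
    static final class Cursor {
        final Postings p;
        int pos;
        Cursor(Postings p) { this.p = p; }
        int doc() { return p.docs[pos]; }
    }

    // Heap order: the weakest hit (lowest score, then highest doc id) comes out first.
    static final Comparator<Hit> WEAKEST_FIRST = (a, b) -> {
        int c = Float.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc);
    };

    static void offer(PriorityQueue<Hit> best, int k, int doc, float score) {
        best.add(new Hit(doc, score));
        if (best.size() > k) best.poll();          // keep at most k hits
    }

    static int[] ranked(PriorityQueue<Hit> best) {
        int[] out = new int[best.size()];
        for (int i = out.length - 1; i >= 0; i--) out[i] = best.poll().doc;
        return out;                                 // best hit first
    }

    /** One accumulator per document in the collection: O(N) working space. */
    static int[] withAccumulators(List<Postings> terms, int numDocs, int k) {
        float[] acc = new float[numDocs];
        for (Postings p : terms)
            for (int i = 0; i < p.docs.length; i++) acc[p.docs[i]] += p.weights[i];
        PriorityQueue<Hit> best = new PriorityQueue<>(WEAKEST_FIRST);
        for (int doc = 0; doc < numDocs; doc++)
            if (acc[doc] > 0f) offer(best, k, doc, acc[doc]);
        return ranked(best);
    }

    /** Merge all posting lists in doc-id order: O(terms + k) working space. */
    static int[] withParallelMerge(List<Postings> terms, int k) {
        PriorityQueue<Cursor> cursors =
            new PriorityQueue<>((a, b) -> Integer.compare(a.doc(), b.doc()));
        for (Postings p : terms)
            if (p.docs.length > 0) cursors.add(new Cursor(p));
        PriorityQueue<Hit> best = new PriorityQueue<>(WEAKEST_FIRST);
        while (!cursors.isEmpty()) {
            int doc = cursors.peek().doc();         // smallest unprocessed doc id
            float score = 0f;
            while (!cursors.isEmpty() && cursors.peek().doc() == doc) {
                Cursor c = cursors.poll();          // a list positioned on this doc
                score += c.p.weights[c.pos];
                if (++c.pos < c.p.docs.length) cursors.add(c);
            }
            if (score > 0f) offer(best, k, doc, score);
        }
        return ranked(best);
    }

    public static void main(String[] args) {
        List<Postings> terms = Arrays.asList(
            new Postings(new int[] {0, 2, 5}, new float[] {1.0f, 2.0f, 0.5f}),
            new Postings(new int[] {2, 3, 5}, new float[] {1.0f, 4.0f, 0.5f}),
            new Postings(new int[] {5}, new float[] {3.0f}));
        System.out.println(Arrays.toString(withAccumulators(terms, 6, 3)));
        System.out.println(Arrays.toString(withParallelMerge(terms, 3)));
    }
}
```

Both calls in main produce the same ranking for this toy data; only the working memory differs. Whether Lucene's scorers follow either shape is exactly what the question asks, so the sketch should not be read as an answer to it.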

    I have been able to drive Lucene through Nutch (in the Eclipse Java
debugger). I am trying to find out the search algorithm used for phrase
queries and regular queries. Which file has the code that reads in the next
posting? If I could find the function which actually reads the indices
during search, then I could set a breakpoint there and understand that code.
I have been able to follow a search up to this point:

---- file---------
  public TopDocs search(Query query, Filter filter, final int nDocs)
       throws IOException {
    Scorer scorer = query.weight(this).scorer(reader);
    // ...
    return new TopDocs(totalHits[0], scoreDocs);
  }

I realized that the first line actually does the core search - i.e.
getting the list of relevant documents.

Scorer scorer = query.weight(this).scorer(reader);

Is that correct? Things get a little hazy after I step into this function.
Can you point me to what's happening with buckets, scorers, weights? Can
someone write a small paragraph about the basic strategy being used here?

Since I am not fully sure about the big picture, i.e. the exact algorithm
being used, it is difficult to follow the code.
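
One way to picture the flow in the excerpt is sketched below in Java. This is a toy model under stated assumptions, not Lucene's code: the names Weight, Scorer, ScoreDoc and TopDocs are borrowed from the excerpt only as labels for hypothetical classes, the "index" is a pair of arrays for a single term, and the scoring formula (boost times term frequency) is invented for the example. The sketch assumes the weight holds per-query state and creates a scorer, the scorer steps through matching documents, and the search loop counts every hit while keeping only the best nDocs in a priority queue.

```java
import java.util.PriorityQueue;

// Toy model: class names echo the excerpt, but none of this is Lucene's actual code.
class SearchFlowSketch {

    /** Steps through matching documents in ascending doc-id order. */
    interface Scorer {
        boolean next();   // advance to the next match; false when exhausted
        int doc();        // current document id
        float score();    // score of the current document
    }

    /** Per-query state computed once, able to create a Scorer over some postings. */
    interface Weight {
        Scorer scorer(int[] docs, int[] freqs);
    }

    /** Hypothetical single-term weight: score = boost * term frequency. */
    static Weight termWeight(final float boost) {
        return (docs, freqs) -> new Scorer() {
            private int pos = -1;
            public boolean next() { return ++pos < docs.length; }
            public int doc() { return docs[pos]; }
            public float score() { return boost * freqs[pos]; }
        };
    }

    static final class ScoreDoc {
        final int doc;
        final float score;
        ScoreDoc(int doc, float score) { this.doc = doc; this.score = score; }
    }

    static final class TopDocs {
        final int totalHits;
        final ScoreDoc[] scoreDocs;
        TopDocs(int totalHits, ScoreDoc[] scoreDocs) {
            this.totalHits = totalHits;
            this.scoreDocs = scoreDocs;
        }
    }

    static TopDocs search(Weight weight, int[] docs, int[] freqs, int nDocs) {
        Scorer scorer = weight.scorer(docs, freqs);
        // Weakest hit on top: lowest score first, ties broken by highest doc id.
        PriorityQueue<ScoreDoc> queue = new PriorityQueue<>((a, b) -> {
            int c = Float.compare(a.score, b.score);
            return c != 0 ? c : Integer.compare(b.doc, a.doc);
        });
        int totalHits = 0;
        while (scorer.next()) {
            totalHits++;                               // every match is counted
            queue.add(new ScoreDoc(scorer.doc(), scorer.score()));
            if (queue.size() > nDocs) queue.poll();    // but only nDocs are kept
        }
        ScoreDoc[] scoreDocs = new ScoreDoc[queue.size()];
        for (int i = scoreDocs.length - 1; i >= 0; i--) scoreDocs[i] = queue.poll();
        return new TopDocs(totalHits, scoreDocs);
    }

    public static void main(String[] args) {
        TopDocs top = search(termWeight(0.5f),
                             new int[] {1, 4, 7, 9}, new int[] {2, 5, 1, 5}, 2);
        System.out.println(top.totalHits + " hits");
        for (ScoreDoc sd : top.scoreDocs) System.out.println(sd.doc + " " + sd.score);
    }
}
```

In this model, creating the scorer does no scoring by itself; the documents are only visited once the loop starts calling next(). Whether the same holds for the real line in the excerpt is part of the open question.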

Also, I was curious whether there is some sort of getting-started guide for
developers. The Getting Started docs and FAQs for Lucene users are very

Thanks for reading this and for your time.

 Vikas Gupta
 Masters Student (Graduating in 2 weeks)
 Dept. of Computer Sciences,
 Univ. of Texas at Austin, USA
