lucene-dev mailing list archives

From Doug Cutting <>
Subject RE: Notes on distributed searching with Lucene
Date Mon, 25 Mar 2002 21:24:36 GMT
> From: Mark Harwood []
> I have written up some of my experiences with creating a distributed
> system with Lucene here:
> It includes some UML interaction diagrams that I found useful in
> understanding the Lucene codebase.


It's great to see someone experimenting with this.  I originally had
distributed searching in mind when I wrote Lucene, but never quite got to
adding it.  A message that mentions some of these intentions is at:

A less "chatty" interface than the one mentioned there might be:

  public interface Searchable {
    public class TermStatistics implements Serializable {
      public int[] docFreqs;
      public int maxDoc;
    }
    // Phase 1: collect term statistics.
    TermStatistics getTermStatistics(Term[] terms) throws IOException;
    // Phase 2: get the top n doc ids and scores.
    TopDocs search(Query query, Filter filter, int n) throws IOException;
    // Phase 3: get the docs for the given ids.
    Document[] getDocs(int[] i) throws IOException;
  }

With these three phases (collect term statistics, get doc id scores, get
docs) the results should be identical to searching the indexes locally with
MultiSearcher.  It sounded like your experiments skipped the first phase.
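
To illustrate why the first phase matters, here is a minimal sketch of merging per-index term statistics into global ones before scoring. It does not use Lucene classes: GlobalStatsSketch, merge and idf are hypothetical names, TermStatistics only mirrors the fields shown above, and the idf formula is an assumed textbook form rather than necessarily the one Lucene applies.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class GlobalStatsSketch {

    /** Mirrors the fields of TermStatistics from the interface above. */
    static final class TermStatistics implements Serializable {
        final int[] docFreqs; // docFreqs[i] belongs to the i-th requested term
        final int maxDoc;

        TermStatistics(int[] docFreqs, int maxDoc) {
            this.docFreqs = docFreqs;
            this.maxDoc = maxDoc;
        }
    }

    /** Sums document frequencies and document counts over all sub-indexes. */
    static TermStatistics merge(List<TermStatistics> perIndex) {
        int terms = perIndex.get(0).docFreqs.length;
        int[] docFreqs = new int[terms];
        int maxDoc = 0;
        for (TermStatistics s : perIndex) {
            for (int i = 0; i < terms; i++) {
                docFreqs[i] += s.docFreqs[i];
            }
            maxDoc += s.maxDoc;
        }
        return new TermStatistics(docFreqs, maxDoc);
    }

    /** Assumed idf form, used here only to show the effect of global statistics. */
    static double idf(int docFreq, int maxDoc) {
        return Math.log((double) maxDoc / (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // Two sub-indexes; the single term is rare in the first, common in the second.
        List<TermStatistics> perIndex = new ArrayList<>();
        perIndex.add(new TermStatistics(new int[] {1}, 100));
        perIndex.add(new TermStatistics(new int[] {40}, 50));

        TermStatistics global = merge(perIndex);
        for (TermStatistics s : perIndex) {
            System.out.println("local idf:  " + idf(s.docFreqs[0], s.maxDoc));
        }
        System.out.println("global idf: " + idf(global.docFreqs[0], global.maxDoc));
    }
}
```

If each sub-index scored with only its local statistics, the same term would get a different weight in each one, and scores from different indexes would not be comparable when merged.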

It would probably be worth writing a MultiThreadSearcher that spawns a
thread for each sub-search, then waits for all to finish before merging the
results.
So, if you are able to work on this more, it would be great to figure out
what it would take to make Query serializable, to convert the Searcher
implementations to use the above interface in place of the existing similar
abstract methods, and finally to implement an RMI-based RemoteSearcher.

