lucene-solr-user mailing list archives

From "Koji Miyamoto" <moto.k...@gmail.com>
Subject Re: extending SolrIndexSearcher
Date Wed, 10 May 2006 07:44:15 GMT
Hi Chris,

My last email msg was in response to your suggestion:

> If it's a Lucene class, you may want to start by making a small proof
> of concept RMI app that just uses the Lucene core classes, once that
> works then try your changes in Solr.

I agree that is a good starting point for narrowing things down, so my
last message contained the actual code of a non-Solr test of
ParallelMultiSearcher with RMI calls.

As for the actual Solr code modifications, these are the relevant pieces:

// approximately line 65, SolrIndexSearcher class attributes:
// this was the original:
// private final IndexSearcher searcher;
// replaced with:
private final ParallelMultiSearcher searcher;


// approximately line 123, the constructor:
private SolrIndexSearcher(IndexSchema schema, String name, IndexReader r,
    boolean closeReader, boolean enableCache) throws Exception {
    this.schema = schema;
    this.name = "Searcher@" + Integer.toHexString(hashCode())
        + (name != null ? " " + name : "");

    log.info("Opening " + this.name);

    reader = r;

    // this is the original:
    //searcher = new IndexSearcher(r);
    // replaced with:
    searcher = _initSearcher();
....
}

// and I added this method to initialize the searcher:
private ParallelMultiSearcher _initSearcher() throws Exception {

      Searchable[] sch = new Searchable[3];

      // local indexes that are searchable..
      for (int i=0; i<2; i++) {
         sch[i] = new IndexSearcher("/disk" + i);
      }

      // a remote searchable available via RMI
      sch[2] = (Searchable) Naming.lookup("//somehost.com:1099/searchit");

      ParallelMultiSearcher searcher = new ParallelMultiSearcher(sch);
      return searcher;
}

After this source change, I run 'ant compile', repackage solr.war, install
it in the appropriate location, and start the example ('java -jar
start.jar'). Then I submit a simple query with curl from the command line:

curl http://localhost:8080/solr/select -d version="2.1" -d start=0 \
     -d rows=10 -d indent=on -d submit=search -d q="body:blablabla"

Without the RMI searchable, the search works just fine. With the RMI
searchable included, I get an exception:

java.rmi.MarshalException: error marshalling arguments; nested exception is:
        java.io.NotSerializableException: org.apache.lucene.search.ParallelMultiSearcher$1
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:122)
        at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
        at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:172)
        at org.apache.lucene.search.Searcher.search(Searcher.java:116)
        at org.apache.lucene.search.Searcher.search(Searcher.java:95)
        at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:794)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:712)
        at org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:605)
        at org.apache.solr.request.StandardRequestHandler.handleRequest(StandardRequestHandler.java:106)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:585)
        at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:80)
        at org.apache.solr.servlet.SolrServlet.doPost(SolrServlet.java:70)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:767)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:860)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:408)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:350)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:195)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:164)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:536)
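
For reference, the class named in the nested exception,
ParallelMultiSearcher$1, follows javac's naming for the first anonymous
inner class declared in ParallelMultiSearcher. The sketch below uses only
the JDK to show that naming and the resulting NotSerializableException.
Collector and AnonymousCollectorDemo are hypothetical stand-ins, not Lucene
or Solr classes, and plain ObjectOutputStream serialization is assumed to be
a fair proxy for how RMI marshals a non-remote argument.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a callback type such as HitCollector:
// an abstract class that does not implement Serializable.
abstract class Collector {
    abstract void collect(int doc, float score);
}

class AnonymousCollectorDemo {

    // Returns an anonymous subclass. javac names it after the enclosing
    // class plus "$1", because it is the first anonymous class declared here.
    static Collector newAnonymousCollector() {
        return new Collector() {
            @Override
            void collect(int doc, float score) {
                // no-op: only the serializability of the class matters here
            }
        };
    }

    // Attempts Java serialization of the given object.
    // Returns null on success, or the exception message (the name of the
    // offending class) when the object is not serializable.
    static String trySerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Collector c = newAnonymousCollector();
        System.out.println(c.getClass().getName());
        System.out.println(trySerialize(c));
    }
}
```

Running main should print a class name ending in $1 twice: once from
getClass().getName() and once as the message of the
NotSerializableException.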

The last Solr frame in the trace is SolrIndexSearcher.java (line 794). That
line is a search call that passes a newly defined anonymous HitCollector:

    searcher.search(query, new HitCollector() {
      float minScore=Float.NEGATIVE_INFINITY;  // minimum score in the priority queue
      public void collect(int doc, float score) {
        if (filt!=null && !filt.exists(doc)) return;
        if (numHits[0]++ < lastDocRequested || score >= minScore) {
          // if docs are always delivered in order, we could use "score>minScore"
          // but BooleanScorer14 might still be used and deliver docs out-of-order?
          hq.insert(new ScoreDoc(doc, score));
          minScore = ((ScoreDoc)hq.top()).score;
        }
      }
    });

Following the exception trail into Lucene, the relevant frames are
(repeated from above for context):

        at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
        at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:172)
        at org.apache.lucene.search.Searcher.search(Searcher.java:116)
        at org.apache.lucene.search.Searcher.search(Searcher.java:95)

which correspond to the following source code:

Searcher.java:95
public void search(Query query, HitCollector results)
    throws IOException {
  search(query, (Filter)null, results);
}

Searcher.java:116
public void search(Query query, Filter filter, HitCollector results)
    throws IOException {
  search(createWeight(query), filter, results);
}

ParallelMultiSearcher.java:172
public void search(Weight weight, Filter filter, final HitCollector results)
    throws IOException {
  for (int i = 0; i < searchables.length; i++) {

    final int start = starts[i];

    // >>> HERE:
    searchables[i].search(weight, filter, new HitCollector() {
        public void collect(int doc, float score) {
          results.collect(doc + start, score);
        }
      });
  }
}

I'm wondering whether the failure comes from passing the HitCollector to
the remote searchable.  Any ideas?
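
A common way around this kind of failure is to ask each searchable for its
top n hits as plain serializable data instead of handing it a callback
object. The sketch below illustrates only that idea with JDK classes. Hit,
TopHitsSource and TopHitsMerger are hypothetical names, not Lucene APIs, and
whether Solr's getDocListNC could be adapted this way is untested here.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical value type: a (doc, score) pair that can cross an RMI boundary.
final class Hit implements Serializable {
    private static final long serialVersionUID = 1L;
    final int doc;
    final float score;

    Hit(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

// Hypothetical sub-index interface: returns results as data, takes no callback.
interface TopHitsSource {
    List<Hit> topHits(int n);
}

class TopHitsMerger {

    // Merges the per-source top-n lists into one global top-n list.
    // Each doc id is shifted by its source's start offset, mirroring the
    // "doc + start" step in the anonymous collector shown above.
    static List<Hit> merge(TopHitsSource[] sources, int[] starts, int n) {
        List<Hit> all = new ArrayList<>();
        for (int i = 0; i < sources.length; i++) {
            for (Hit h : sources[i].topHits(n)) {
                all.add(new Hit(h.doc + starts[i], h.score));
            }
        }
        // Highest score first, then keep at most n hits.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }
}
```

The trade-off in this sketch is that the caller only ever sees the top n
hits per source, so code that needs to inspect every matching document (as
a collector does) would not map onto it directly.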

thanks,
Koji


On 5/9/06, Chris Hostetter <hossman_lucene@fucit.org> wrote:
>
>
> : IndexSearcher.  I replaced it with ParallelMultiSearcher, where it is
> : initialized similar to the client code I mentioned above.
> :
> : From that, it seems like Solr itself needs to marshall and unmarshall the
> : searcher instance SolrIndexSearcher holds, and because the
> : ParallelMultiSearcher is initialized with RMI stubs, it fails to proceed
> : with such marshall/unmarshall internal actions.  As mentioned in the first
> : email, if I use ParallelMultiSearcher to only look at local indexes (no RMI
> : stub), Solr works just fine.  So I'm wondering if there is a way to use
> : SolrIndexSearcher to search both local and remote indexes, even if not
> : through the RMI solution Lucene's ebook has suggested via its
> : ParallelMultiSearcher class.
>
> As I said, I don't really know a lot about RMI, but I don't think the
> client code is expected to marshall/unmarshall things -- but the objects
> you want to pass to remote methods (or receive back from remote
> methods) need to be serializable.  Do you know what objects you got
> serialization exceptions from? (you didn't include any real source -- just
> pseudocode, so it's not possible to use the line numbers in your stack
> trace to look at the code because we don't know exactly what you changed)
>
>
>
> -Hoss
>
>
