lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: BitSet Filter ArrayIndexOutOfBoundsException?
Date Wed, 15 Apr 2009 23:41:18 GMT
Use the index reader passed to getDocIdSet. The doc IDs you return are only
valid relative to that index reader. This is new in Lucene 2.9: filters are
executed against each segment of an index separately, so the doc IDs of the
MultiReader/DirectoryIndexReader differ from the segment-local ones.
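
To illustrate the difference between the two ID spaces, here is a minimal
sketch in plain Java with no Lucene dependency. It assumes that each segment
occupies a contiguous range of the top-level ID space starting at an offset
(called docBase here). The class name, the method names and the segment sizes
are hypothetical and chosen only for illustration; this is not Lucene's own
code.

```java
import java.util.BitSet;

class SegmentDocIds {

    /** Computes the starting offset (docBase) of each segment in the top-level ID space. */
    static int[] docBases(int[] segmentMaxDocs) {
        int[] bases = new int[segmentMaxDocs.length];
        int base = 0;
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            bases[i] = base;
            base += segmentMaxDocs[i];
        }
        return bases;
    }

    /** Re-expresses top-level matches as segment-local bits for a single segment. */
    static BitSet toSegmentLocal(BitSet topLevelMatches, int docBase, int maxDoc) {
        BitSet local = new BitSet(maxDoc);
        // Visit only the set bits that fall inside [docBase, docBase + maxDoc).
        for (int g = topLevelMatches.nextSetBit(docBase);
             g >= 0 && g < docBase + maxDoc;
             g = topLevelMatches.nextSetBit(g + 1)) {
            local.set(g - docBase);
        }
        return local;
    }

    public static void main(String[] args) {
        int[] maxDocs = {60, 10, 5};          // hypothetical segment sizes
        int[] bases = docBases(maxDocs);      // offsets 0, 60, 70

        BitSet topLevel = new BitSet();       // matches collected with top-level IDs
        topLevel.set(3);
        topLevel.set(67);
        topLevel.set(72);

        for (int i = 0; i < maxDocs.length; i++) {
            System.out.println("segment " + i + ": "
                + toSegmentLocal(topLevel, bases[i], maxDocs[i]));
        }
    }
}
```

In this sketch the top-level ID 67 becomes local ID 7 in the second segment.
Setting bit 67 in a set sized for a 10-document segment would fall outside
its range, which may be the kind of mismatch behind the exception reported
below.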

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Ryan McKinley [mailto:ryantxu@gmail.com]
> Sent: Thursday, April 16, 2009 1:34 AM
> To: java-user@lucene.apache.org
> Subject: Re: BitSet Filter ArrayIndexOutOfBoundsException?
> 
> Are you saying that a Lucene document could have different ids in the
> MultiReader and the IndexReader?
> 
> I have assumed that the ids have not changed as long as the
> lastmodified time has not changed:
>    long lastmodified = IndexReader.lastModified( reader.directory() );
> Is this assumption correct?
> 
> I get the original ids using:
> 
>      SolrIndexSearcher searcher = ...
>      DocList docs = searcher.getDocList( new MatchAllDocsQuery(),
>          (DocSet)null, null, 0, Integer.MAX_VALUE );
> 
> and assume that nothing has changed as long as:
>     IndexReader.lastModified( searcher.getReader().directory() );
> has not changed.
> 
> Am I missing something?
> 
> If so, how would I get access to the docId expected by
> Filter#getDocIdSet()?
> 
> thanks!
> ryan
> 
> 
> On Apr 15, 2009, at 5:41 PM, Michael McCandless wrote:
> 
> > Maybe it's because you're using the MultiReader docID space but
> > getDocIdSet(IndexReader) expects you to use the docID space for that
> > IndexReader (ie, a single segment)?
> >
> > Mike
> >
> > On Wed, Apr 15, 2009 at 1:37 PM, Ryan McKinley <ryantxu@gmail.com>
> > wrote:
> >> I am working on a Filter that uses an RTree to test for inclusion. This
> >> Filter works great *most* of the time -- if the index is optimized, it
> >> works all of the time.  I feel like I am missing something basic, but
> >> not sure what it could be.
> >>
> >> Each time the reader opens (and the index has changed), I build an RTree
> >> from stored fields.  The RTree holds the lucene document ID and is later
> >> used in a Filter/Query.  This is how I build the RTree:
> >>
> >>  FieldSelector selector = new MapFieldSelector( new String[] { "extent" } );
> >>  DocIterator iter = docs.iterator();
> >>  while( iter.hasNext() ) {
> >>    int id = iter.nextDoc();
> >>    Document doc = searcher.doc( id, selector );
> >>    Fieldable ff = doc.getFieldable( "extent" );
> >>    if( ff != null && !reader.isDeleted( id ) ) {
> >>      ... add the id to the RTree ...
> >>    }
> >>  }
> >>
> >> In the Filter, I query my RTree and add the results to a BitSet:
> >>
> >>  public DocIdSet getDocIdSet(IndexReader reader) throws IOException
> >>  {
> >>    final BitSet bits = new BitSet();
> >>
> >>    // ... query the RTree adding matching ids to the BitSet...
> >>      bits.set( id );
> >>
> >>    return new DocIdBitSet( bits );
> >>  }
> >>
> >> When things go wrong, I get an error like this:
> >>
> >> java.lang.ArrayIndexOutOfBoundsException: 67
> >>     at org.apache.lucene.util.OpenBitSet.fastSet(OpenBitSet.java:242)
> >>     at org.apache.solr.search.DocSetHitCollector.collect(DocSetHitCollector.java:63)
> >>     at org.apache.lucene.search.IndexSearcher$MultiReaderCollectorWrapper.collect(IndexSearcher.java:313)
> >>     at org.apache.lucene.search.Scorer.score(Scorer.java:58)
> >>     at org.apache.lucene.search.IndexSearcher.doSearch(IndexSearcher.java:262)
> >>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:250)
> >>     at org.apache.lucene.search.Searcher.search(Searcher.java:126)
> >>     at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:691)
> >>     at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:597)
> >>     at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:633)
> >>     at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1154)
> >>     at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:924)
> >>     at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:345)
> >>     at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:171)
> >>
> >> I'm guessing it is referencing a deleted document or something like that,
> >> but I figured the:
> >>  && !reader.isDeleted( id ) clause would take care of that.
> >>
> >> Any pointers would be great!
> >>
> >> Thanks
> >> Ryan
> >>
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> 
> 

