lucene-dev mailing list archives

From <karl.wri...@nokia.com>
Subject ArrayIndexOutOfBounds exception using FieldCache
Date Wed, 27 Oct 2010 13:21:41 GMT
Hi Folks,

I just tried to index a data set roughly twice as large as the one I had been using, with
the same code.  The indexing completed fine, although it was slower than I would have
liked. ;-)  But the following problem occurs when I try to use FieldCache to look up the
value of a field that is both indexed and stored:

java.lang.ArrayIndexOutOfBoundsException: -65406
        at org.apache.lucene.util.PagedBytes$Reader.fillUsingLengthPrefix(PagedBytes.java:98)
        at org.apache.lucene.search.FieldCacheImpl$DocTermsImpl.getTerm(FieldCacheImpl.java:918)
        at ...

The code that does this has been working for quite some time and has been unmodified:

    /** Find a string field value, given the Lucene ID and field name.
    */
    protected String getStringValue(int luceneID, String fieldName)
      throws IOException
    {
      // Find the right reader
      final int idx = readerIndex(luceneID, starts, readers.length);
      final int docBase = starts[idx];
      final IndexReader reader = readers[idx];

      BytesRef ref = FieldCache.DEFAULT.getTerms(reader,fieldName).getTerm(luceneID-docBase,new BytesRef());
      String rval = ref.utf8ToString();
      //System.out.println(" Reading luceneID "+Integer.toString(luceneID)+" field "+fieldName+" with result '"+rval+"'");
      return rval;
    }

  }
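
The readerIndex helper called above is not shown in this message. As a rough sketch only, assuming it is a binary search over the starts array (starts[i] being the first global document ID of sub-reader i), the lookup might look like the following. The class name ReaderLookup and the sample values in main are hypothetical and are not taken from the original code.

```java
public class ReaderLookup {
    /**
     * Returns the index of the sub-reader containing global document ID n.
     * Assumption: starts[i] is the first global doc ID of sub-reader i,
     * starts is sorted ascending, and numReaders is the number of sub-readers.
     */
    public static int readerIndex(int n, int[] starts, int numReaders) {
        int lo = 0;
        int hi = numReaders - 1;
        while (hi >= lo) {
            int mid = (lo + hi) >>> 1;
            int midValue = starts[mid];
            if (n < midValue) {
                hi = mid - 1;
            } else if (n > midValue) {
                lo = mid + 1;
            } else {
                // Exact match: step past any empty sub-readers that share this start.
                while (mid + 1 < numReaders && starts[mid + 1] == midValue) {
                    mid++;
                }
                return mid;
            }
        }
        // No exact match: hi is the last sub-reader whose start is below n.
        return hi;
    }

    public static void main(String[] args) {
        int[] starts = {0, 100, 250};   // hypothetical doc bases for three sub-readers
        int luceneID = 120;
        int idx = readerIndex(luceneID, starts, starts.length);
        int docBase = starts[idx];
        // Prints: idx=1, local doc=20
        System.out.println("idx=" + idx + ", local doc=" + (luceneID - docBase));
    }
}
```

With a single reader whose start is 0, this sketch returns index 0 for any non-negative document ID, so the document ID passed to getTerm is the same as the global one.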

I added a try/catch to see what values were going into the key line:

catch (RuntimeException e)
    {
        System.out.println("LuceneID = "+luceneID+", fieldName='"+fieldName+"', idx="+idx+", docBase="+docBase);
        System.out.println("Readers = "+readers.length);
        int i = 0;
        while (i < readers.length)
            {
                System.out.println(" Reader start "+i+" is "+starts[i]);
                i++;
            }
        throw e;
    }

The resulting output was:

LuceneID = 34466856, fieldName='id', idx=0, docBase=0
Readers = 1
     Reader start 0 is 0

... which looks reasonable on the face of things.  This is a version of trunk from approximately
8/12/2010, so it is fairly old.  Was there a fix for a problem that could account for this
behavior?  Should I simply synch up?  Or am I doing something wrong here?  The schema for
the id field is:

<fieldType name="string_idx" class="solr.StrField" sortMissingLast="true" indexed="true" stored="true"/>
<field name="id" type="string_idx" required="true"/>

Karl

