lucene-dev mailing list archives

From Doug Cutting <DCutt...@grandcentral.com>
Subject RE: RAMDirectory bug?
Date Fri, 07 Dec 2001 18:10:04 GMT
Your test case is not self-contained.  For example, you do not provide the
classes FourgenVendorInfo or VendorInfo.  So I cannot reproduce this.
Please provide a complete test case that can easily be compiled and run.
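
One way to make such a test case self-contained is to replace the database-backed classes with generated data. The following is only a minimal sketch of that idea using the JDK alone: the class name SyntheticVendor, its fields, the sample city and state values, and the generate() method are all hypothetical stand-ins, not the real FourgenVendorInfo or VendorInfo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for the missing FourgenVendorInfo; all names are assumptions. */
final class SyntheticVendor {
    private static final String[] STATES = {"MN", "WI", "IA", "ND", "SD"};
    private static final String[] CITIES = {"Lisbon", "Lima", "Duluth", "Fargo", "Ames"};

    final String vendCode;
    final String busName;
    final String address1;
    final String city;
    final String state;
    final String zip;

    SyntheticVendor(String vendCode, String busName, String address1,
                    String city, String state, String zip) {
        this.vendCode = vendCode;
        this.busName = busName;
        this.address1 = address1;
        this.city = city;
        this.state = state;
        this.zip = zip;
    }

    /** Builds count records; the same seed always yields the same data. */
    static List<SyntheticVendor> generate(int count, long seed) {
        Random random = new Random(seed);
        List<SyntheticVendor> vendors = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            String zip = String.format("%05d-%04d",
                    random.nextInt(100000), random.nextInt(10000));
            vendors.add(new SyntheticVendor(
                    "V" + i,                                  // unique primary key
                    "Vendor " + i,
                    (1 + random.nextInt(9999)) + " Main St",
                    CITIES[random.nextInt(CITIES.length)],
                    STATES[random.nextInt(STATES.length)],
                    zip));
        }
        return vendors;
    }

    public static void main(String[] args) {
        // 8000 matches the document count mentioned in the report.
        List<SyntheticVendor> vendors = generate(8000, 42L);
        System.out.println(vendors.size() + " vendors generated");
    }
}
```

Because the seed is fixed, anyone running the test indexes exactly the same records, which matters if the failure depends on the data.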

Thanks,

Doug

> -----Original Message-----
> From: Kristian Rickert [mailto:Kristian.Rickert@halo.com]
> Sent: Thursday, December 06, 2001 5:59 PM
> To: 'lucene-dev@jakarta.apache.org'
> Subject: RAMDirectory bug?
> 
> 
> I'm getting the classic bug "ArrayIndexOutOfBounds" when performing a search
> in my RAMDirectory.
> 
> So I'm going to explain this one in full detail, so forgive me if this is a
> waste of your time.  I really do think this is a resurfaced bug, because when
> my RAMDirectory has fewer than ~2000 documents, searching works flawlessly.
> 
> So here goes:
> 
> 
> I used release candidate 2 and the nightly build, and both yield the same
> result.
> 
> I am having the same problem with the RAMDirectory search as mentioned in
> message # ... here is a self-contained example that will cause the same
> error.  I'll comment along the way so I'm not one of those asses who says,
> "here's my code.. fix it!"
> 
> * First of all, I am using this application to index objects that just
> contain simple values.  I already use an RB tree to retrieve the information,
> which is plenty quick for what I need.
> * So, I am adding these fields as unstored, indexed only.
> * Furthermore, I do not need to index the primary key, only store it, so my
> "VendorDocument" file goes as follows:
>   public static Document Document(FourgenVendorInfo fvi) {
>     // make a new, empty document
>     Document doc = new Document();
> 
>     doc.add(Field.UnStored("Address1", "" + fvi.getAddress1()));
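>     // Originally called getAddress1() here, which looks like a copy-paste slip;
>     // getAddress2() is assumed to exist on FourgenVendorInfo.
>     doc.add(Field.UnStored("Address2", "" + fvi.getAddress2()));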
>     doc.add(Field.UnStored("BusinessName", "" + fvi.getBus_name()));
>     doc.add(Field.UnStored("City", "" + fvi.getCity()));
>     doc.add(Field.UnStored("State", "" + fvi.getState()));
>     doc.add(Field.UnStored("CountryCode", "" + fvi.getCountry_code()));
>     doc.add(Field.UnStored("Fax", "" + fvi.getFax_phone()));
>     doc.add(Field.UnStored("Asi", "" + fvi.getHal_asi_no()));
>     doc.add(Field.UnStored("Phone", "" + fvi.getPhone()));
>     doc.add(Field.UnIndexed("VendorCode", "" + fvi.getVend_code()));
>     doc.add(Field.UnStored("Zip", "" + fvi.getZip()));
>     doc.add(Field.UnStored("PlusFour", "" +
>         AddressParsers.parsePlusFour(fvi.getZip())));
>     return doc;
>   }
> * I have about 8000 of these documents to add to the index.  
> 
> 
> 
> First, I create RAMStorage:
>         RAMDirectory RAMStorage = new RAMDirectory();
>         //RAMStorage.createFile("Vendors");
>         IndexWriter indexer = null;
>         try {
> 
> Create the IndexWriter with the RAMStorage, using my vendor analyzer, which
> is a simplified form of SimpleAnalyzer (it doesn't ignore digits; the code
> differs from SimpleAnalyzer by literally 3 letters):
>             indexer = new IndexWriter(RAMStorage, new VendorAnalyzer(), true);
>             
> 
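
Since the VendorAnalyzer source is not included, here is a rough, JDK-only sketch of the tokenization the description above suggests: split on anything that is not a letter or a digit, and lowercase what remains. The class LetterDigitTokens and its tokenize() method are hypothetical, and the lowercasing is an assumption carried over from SimpleAnalyzer; the real VendorAnalyzer may behave differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical approximation of the described analyzer; not the real VendorAnalyzer. */
final class LetterDigitTokens {

    /** Splits text at every character that is neither a letter nor a digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // Digits are kept, unlike a letters-only tokenizer.
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [55401, 1234, mn, suite, 4b]
        System.out.println(tokenize("55401-1234 MN, Suite 4B"));
    }
}
```

Under these assumptions a State value of "MN" is indexed as "mn", which is consistent with searching for "+State:mn".
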
> I add the documents to the indexer, optimize it and close it.
>             if (fviAllVendors != null) {
>                 for (int i = 0; i < fviAllVendors.length; i++) {
>                     Document currentDoc =
> VendorDocument.Document(fviAllVendors[i]);
>                     indexer.addDocument(currentDoc);
>                     //System.out.println(currentDoc.toString());
>                 }
>             }
>             indexer.optimize();
>             indexer.close();
> 
> Finally, I perform the search with the line "+State:mn":
>     Query query = QueryParser.parse(line, "contents", analyzer);
>     System.out.println("Searching for: " + query.toString("contents"));
>     Hits hits = searcher.search(query);
> 
> 
> 
> It is at this point that I get the array index out of bounds exception.
> 
> Other facts to especially note:
> * This error only happens when there is a successful hit in the search (this
> makes sense if you view the stack trace).
> * I have noticed that when I have an index size of ~2000, I never get the
> thing to break.  Thus, I might just break this up into multiple RAM
> directories as a hack fix, although I suspect it could be the data I'm
> providing.
> * Wildcard queries work fine with the parser.
> * According to the stack trace, the error happens in a readInternal()
> call within the RAMInputStream.
> 
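
To illustrate the kind of failure the last point describes, here is a generic sketch of a RAM-backed file that keeps its bytes in fixed-size chunks, and of how a read that runs past the last chunk would surface as an index-out-of-bounds error. This is not Lucene's source: the class ChunkedRamFile, the CHUNK_SIZE value and both methods are hypothetical, and nothing here establishes that this is the actual cause of the reported error.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical chunked in-memory file; NOT Lucene's RAMInputStream. */
final class ChunkedRamFile {
    static final int CHUNK_SIZE = 1024; // assumed chunk size, for illustration only

    private final List<byte[]> chunks = new ArrayList<>();
    private int length = 0;

    /** Appends bytes, allocating a new chunk whenever the last one is full. */
    void write(byte[] src) {
        for (byte b : src) {
            if (length / CHUNK_SIZE == chunks.size()) {
                chunks.add(new byte[CHUNK_SIZE]);
            }
            chunks.get(length / CHUNK_SIZE)[length % CHUNK_SIZE] = b;
            length++;
        }
    }

    /** Copies len bytes starting at position into dest, crossing chunks as needed. */
    void read(int position, byte[] dest, int destOffset, int len) {
        int remaining = len;
        while (remaining > 0) {
            int chunkIndex = position / CHUNK_SIZE;
            int chunkOffset = position % CHUNK_SIZE;
            int toCopy = Math.min(CHUNK_SIZE - chunkOffset, remaining);
            // No bounds check on position: asking for a chunk that was never
            // allocated throws IndexOutOfBoundsException here.
            byte[] chunk = chunks.get(chunkIndex);
            System.arraycopy(chunk, chunkOffset, dest, destOffset, toCopy);
            position += toCopy;
            destOffset += toCopy;
            remaining -= toCopy;
        }
    }

    public static void main(String[] args) {
        ChunkedRamFile file = new ChunkedRamFile();
        file.write(new byte[1500]);               // occupies two chunks
        file.read(1000, new byte[100], 0, 100);   // spans the chunk boundary, succeeds
        try {
            file.read(2040, new byte[16], 0, 16); // runs into a third, missing chunk
        } catch (IndexOutOfBoundsException e) {
            System.out.println("read past the last chunk: " + e);
        }
    }
}
```

In a layout like this, a bad file pointer or length only fails once it crosses into a chunk that does not exist, so small inputs can appear to work while larger ones break.
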
> Oh yeah, my environment:
> * Same error on NT 4.0 and Sun OS 7.
> * 2 GB memory with a 100MB heap - nothing really taking up memory space
> 
> 
> 
> I worked on a search engine on my own and will be willing to contribute if I
> find out the problem.  For now, I may just switch to a file system search
> instead.  But this will probably be slower than if I optimized the hell out
> of Oracle and had that database do the trick for me.
> 
> I hope this will help.  Below is a listing of the main file I've been using
> to test.  Also, you'll see a copy of the stack trace.
> 
> 
> 
> 
> import java.io.IOException;
> import java.io.BufferedReader;
> import java.io.InputStreamReader;
> 
> import org.apache.log4j.Category;
> import org.apache.lucene.store.RAMDirectory;
> import org.apache.lucene.index.*;
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.analysis.SimpleAnalyzer;
> import org.apache.lucene.analysis.StopAnalyzer;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.search.Searcher;
> import org.apache.lucene.search.IndexSearcher;
> import org.apache.lucene.search.WildcardQuery;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.search.Hits;
> import org.apache.lucene.queryParser.QueryParser;
> 
> /**
>  *
>  * @author  FOOBAR
>  * @version 
>  */
> public class VendorSearchIndexTest {
> 
>     
> 
>     public static VendorInfo [] fviAllVendors;
> 
>     /**
>     * @param args the command line arguments
>     */
>     public static void main (String args[]) {
>         VendorInfo [] fviAllVendors =
> DB_FourgenVendor.retrieveFourgenVendors();
> 
>         //create a RAM directory
>         RAMDirectory RAMStorage = new RAMDirectory();
>         //I ran this with and without the line below
>         RAMStorage.createFile("Vendors");
> 
>         IndexWriter indexer = null;
>         try {
>             indexer = new IndexWriter(RAMStorage, new VendorAnalyzer(), true);
>             
> 
>             if (fviAllVendors != null) {
>                 for (int i = 0; i < fviAllVendors.length; i++) {
>                     Document currentDoc =
> VendorDocument.Document(fviAllVendors[i]);
>                     indexer.addDocument(currentDoc);
>                     //System.out.println(currentDoc.toString());
>                 }
>             }
>             indexer.optimize();
>             indexer.close();
>             Searcher searcher = new IndexSearcher(RAMStorage);
>             Analyzer analyzer = new VendorAnalyzer();
>       
> 
>             BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
>             while (true) {
>                 System.out.print("Query: ");
>                 String line = in.readLine();
> 
>                 // readLine() returns null at end of input; length() is never -1
>                 if (line == null || line.length() == 0)
>                     break;
>                 //WildcardQuery query = new WildcardQuery(new Term("+City", "LI*"));
>                 Query query = QueryParser.parse(line, "contents", analyzer);
>                 System.out.println("Searching for: " + query.toString("contents"));
> 
>                 Hits hits = searcher.search(query);
>                 System.out.println(hits.length() + " total matching documents");
> 
>                 final int HITS_PER_PAGE = 10;
>                 for (int start = 0; start < hits.length(); start += HITS_PER_PAGE) {
>                     int end = Math.min(hits.length(), start + HITS_PER_PAGE);
>                     for (int i = start; i < end; i++)
>                         System.out.println(i + ". " + hits.doc(i).get("VendorCode"));
>                     if (hits.length() > end) {
>                         System.out.print("more (y/n) ? ");
>                         line = in.readLine();
>                         if (line == null || line.length() == 0 || line.charAt(0) == 'n')
>                             break;
>                     }
>                 }
>             }
>             searcher.close();
>   
>             
>         } catch (Exception e) {
>             System.out.println("Exception.. what the?: " + e.toString());
>             e.printStackTrace(); // show the full stack trace, not just the message
>         }
>     }
> }
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
