lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Help to solve an issue when upgrading Lucene-Oracle integration to lucene 2.3.1
Date Wed, 07 May 2008 13:51:14 GMT

I didn't see any attachments on this email?  (I was expecting the .trc
file so I could look at the full infoStream output.)

Mike

Marcelo Ochoa wrote:
> Hi Michael:
>   First thanks a lot for your time.
>   See comments below.
>>  Is there any way to capture & serialize the actual documents being
>>  added (this way I can "replay" those docs to reproduce it)?
>   Documents come from Oracle's ALL_SOURCE system view (the text is a
> VARCHAR2 column); in fact they are loaded from a table created as:
> create table test_source_big as (select * from all_source);
>>
>>  Are you using threads?  Is autoCommit true or false?
>   The Oracle JVM uses a single-thread model by default, except that
> Lucene starts a parallel thread; the infoStream output shows only one
> thread.
>   autoCommit is false.
>   I am creating the IndexWriter with this code:
>         Parameters parameters = dir.getParameters();
>         int mergeFactor =
>             Integer.parseInt(parameters.getParameter("MergeFactor",
>                 "" + LogMergePolicy.DEFAULT_MERGE_FACTOR));
>         int maxBufferedDocs =
>             Integer.parseInt(parameters.getParameter("MaxBufferedDocs",
>                 "" + IndexWriter.DEFAULT_MAX_BUFFERED_DOCS));
>         int maxMergeDocs =
>             Integer.parseInt(parameters.getParameter("MaxMergeDocs",
>                 "" + LogDocMergePolicy.DEFAULT_MAX_MERGE_DOCS));
>         int maxBufferedDeleteTerms =
>             Integer.parseInt(parameters.getParameter("MaxBufferedDeleteTerms",
>                 "" + IndexWriter.DEFAULT_MAX_BUFFERED_DELETE_TERMS));
>         Analyzer analyzer = getAnalyzer(parameters);
>         boolean useCompoundFileName =
>             "true".equalsIgnoreCase(parameters.getParameter("UseCompoundFile",
>                 "false"));
>         boolean autoTuneMemory =
>             "true".equalsIgnoreCase(parameters.getParameter("AutoTuneMemory",
>                 "true"));
>         IndexWriter writer =
>             new IndexWriter(dir, autoCommitEnable, analyzer, createEnable);
>         if (autoTuneMemory) {
>             long memLimit =
>                 ((OracleRuntime.getJavaPoolSize() / 100) * 50) / (1024 * 1024);
>             logger.info(".getIndexWriterForDir - Memory limit for " +
>                 "indexing (Mb): " + memLimit);
>             writer.setRAMBufferSizeMB(memLimit);
>         } else {
>             writer.setMaxBufferedDocs(maxBufferedDocs);
>         }
>         writer.setMaxMergeDocs(maxMergeDocs);
>         writer.setMaxBufferedDeleteTerms(maxBufferedDeleteTerms);
>         writer.setMergeFactor(mergeFactor);
>         writer.setUseCompoundFile(useCompoundFileName);
>         if (logger.isLoggable(Level.FINE))
>             writer.setInfoStream(System.out);
>    The example passes these relevant parameters:
>
> AutoTuneMemory:true;LogLevel:FINE;Analyzer:org.apache.lucene.analysis.StopAnalyzer;MergeFactor:500
>    So, because AutoTuneMemory is true, instead of setting
> MaxBufferedDocs I call setRAMBufferSizeMB(53), where the limit is
> calculated from Oracle's free SGA memory.
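The auto-tune arithmetic quoted above (50% of the Java pool, converted from
bytes to megabytes) can be sketched in plain Java. This is only an
illustration: the hypothetical pool size below stands in for
OracleRuntime.getJavaPoolSize(), which is only available inside the Oracle JVM.

```java
// Sketch of the auto-tune arithmetic: take 50% of the Java pool size
// (in bytes) and convert it to megabytes, as the quoted code does.
public class AutoTuneMemory {
    static long ramBufferMB(long javaPoolSizeBytes) {
        return ((javaPoolSizeBytes / 100) * 50) / (1024 * 1024);
    }

    public static void main(String[] args) {
        long poolBytes = 112L * 1024 * 1024; // assume a ~112 MB java_pool
        System.out.println(ramBufferMB(poolBytes)); // prints 56
    }
}
```

Note that setRAMBufferSizeMB and setMaxBufferedDocs are alternative flush
triggers, which is why the quoted code sets only one of them.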
>>
>>  Are you using payloads?
>   No.
>>
>>  Were there any previous exceptions in this IndexWriter before  
>> flushing
>>  this segment?  Could you post the full infoStream output?
>   There is no previous exception. Attached is a .trc file generated by
> Oracle 11g; it has the infoStream output plus logging information from
> the Oracle-Lucene data cartridge.
>>
> <snip>
>>  Could you apply the patch below & re-run?  It will likely produce
>>  insane amounts of output, but we only need the last section to see
>>  which term is hitting the bug.  If that term consistently hits  
>> the bug
>>  then we can focus on how/when it gets indexed...
>   I'll patch my lucene-2.3.1 sources and send the .trc file again.
>   Also, I am comparing the FSDirectory implementation (2.3.1) with my
> OJVMDirectory implementation to look for changes in how the API of
> BufferedIndex[Input|Output].java is used; maybe the problem is here.
>   For example, the latest implementation expects an IOException when
> opening an IndexInput for a file that doesn't exist, but my code threw
> a RuntimeException, which worked with Lucene 2.2.x but not with 2.3.1;
> that was the first change needed to get the Lucene-Oracle integration
> working.
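The exception-contract point can be shown with a minimal, Lucene-free sketch
(the class and method below are illustrative stand-ins, not the actual
OJVMDirectory code): a Directory-style openInput should signal a missing file
with a FileNotFoundException, an IOException subclass that callers can catch
and recover from, rather than an unchecked RuntimeException.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Illustrative stand-in for Directory.openInput(String): a missing file
// is reported as a checked IOException, so file-probing callers can
// catch it and continue instead of being killed by a RuntimeException.
public class OpenInputContract {
    static Object openInput(String name) throws IOException {
        boolean exists = false; // assume the lookup against the store failed
        if (!exists)
            throw new FileNotFoundException(name);
        return new Object(); // would be an IndexInput in real code
    }

    public static void main(String[] args) {
        try {
            openInput("segments_2");
        } catch (IOException e) {
            // recoverable path: the caller treats this as "file not present"
            System.out.println("missing: " + e.getMessage());
        }
    }
}
```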
>   Best regards. Marcelo.
> -- 
> Marcelo F. Ochoa
> http://marceloochoa.blogspot.com/
> http://marcelo.ochoa.googlepages.com/home
> ______________
> Do you Know DBPrism? Look @ DB Prism's Web Site
> http://www.dbprism.com.ar/index.html
> More info?
> Chapter 17 of the book "Programming the Oracle Database using Java &
> Web Services"
> http://www.amazon.com/gp/product/1555583296/
> Chapter 21 of the book "Professional XML Databases" - Wrox Press
> http://www.amazon.com/gp/product/1861003587/
> Chapter 8 of the book "Oracle & Open Source" - O'Reilly
> http://www.oreilly.com/catalog/oracleopen/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

