lucene-java-user mailing list archives

From "John Wang" <john.w...@gmail.com>
Subject Re: merge factor and real time indexing
Date Tue, 30 May 2006 23:30:51 GMT
Hi Otis:

    Thanks for your reply.

    The problem is the setting of maxBufferedDocs (see the code snippet in
my previous email). I wouldn't have expected it to affect things, but it does.
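
    For what it's worth, one quick way to see the effect is simply to list the
index directory after a run. Below is a rough, untested sketch that reuses the
/tmp/john/ path from the test code; the class name is just for illustration:

import java.io.File;

public class ListIndexFiles {
    public static void main(String[] args) {
        // Print whatever files Lucene left behind in the index directory.
        File dir = new File("/tmp/john/");
        File[] files = dir.listFiles();
        if (files == null) {
            System.out.println("no index directory at " + dir);
            return;
        }
        System.out.println(files.length + " files:");
        for (File f : files) {
            System.out.println("  " + f.getName() + " (" + f.length() + " bytes)");
        }
    }
}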

-John

On 5/30/06, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
>
> John,
> I can't spot anything wrong in the code.  I assume you are certain there
> are no IOExceptions, no issues with locks, and that your writer is really
> getting close()d.  I haven't used IndexModifier, but since it's just a layer
> on top of the real IndexWriter exposing the same API, I would use
> IndexWriter directly and see if that makes a difference.  If it does, then
> maybe there is a bug in IndexModifier.
>
> Otis
>
> ----- Original Message ----
> From: John Wang <john.wang@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Tuesday, May 30, 2006 5:45:39 AM
> Subject: merge factor and real time indexing
>
> Hi folks:
>
>     I am working on an application that requires real-time indexing, i.e.
> for every insert I open the writer, add a document, and then close the
> writer.
>
>     I want to control the number of files created, and according to the
> documentation a small mergeFactor is what is recommended. However, I am
> seeing the opposite; see the following code segment:
>
> import java.io.File;
> import java.io.IOException;
>
> import org.apache.lucene.analysis.standard.StandardAnalyzer;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.document.Field;
> import org.apache.lucene.index.IndexModifier;
> import org.apache.lucene.index.IndexReader;
>
> public class MergeFactorTest {
>
>     public static void main(String[] args) throws IOException {
>         int mfactor = 10;    // mergeFactor
>         int mbuffer = 1000;  // maxBufferedDocs
>
>         IndexModifier writer = null;
>         File dir = new File("/tmp/john/");
>
>         long start = System.currentTimeMillis();
>         for (int i = 0; i < 5000; ++i) {
>             try {
>                 // Create the index on the first pass, append afterwards.
>                 boolean create = !IndexReader.indexExists(dir);
>                 writer = new IndexModifier(dir, new StandardAnalyzer(), create);
>                 writer.setMergeFactor(mfactor);
>                 writer.setMaxBufferedDocs(mbuffer);
>
>                 Document doc = new Document();
>                 doc.add(new Field("test", "this is a test doc",
>                         Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
>                 writer.addDocument(doc);
>             } finally {
>                 // Open/add/close per document to simulate real-time indexing.
>                 if (writer != null) {
>                     writer.close();
>                 }
>             }
>         }
>         long end = System.currentTimeMillis();
>
>         System.out.println("took: " + (end - start));
>     }
> }
>
> If I set the mfactor value to a high number, e.g. 1000, indexing takes much
> longer but the number of files decreases dramatically.
>
> Is this expected, or are there better ways of tuning the indexing parameters
> so that I limit the number of open files while still getting a decent
> indexing speed?
>
> Thanks
>
> -John
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
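
For reference, here is roughly what the same loop looks like with IndexWriter
used directly, as Otis suggests above. This is only an untested sketch against
the IndexWriter API of this Lucene version (the class name is mine); if it
behaves differently from the IndexModifier version, that would point at
IndexModifier:

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;

public class IndexWriterDirectTest {

    public static void main(String[] args) throws IOException {
        int mfactor = 10;    // mergeFactor
        int mbuffer = 1000;  // maxBufferedDocs

        File dir = new File("/tmp/john/");
        long start = System.currentTimeMillis();

        for (int i = 0; i < 5000; ++i) {
            // Same open-add-close pattern as before, but on IndexWriter itself.
            boolean create = !IndexReader.indexExists(dir);
            IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), create);
            try {
                writer.setMergeFactor(mfactor);
                writer.setMaxBufferedDocs(mbuffer);

                Document doc = new Document();
                doc.add(new Field("test", "this is a test doc",
                        Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
                writer.addDocument(doc);
            } finally {
                writer.close();
            }
        }

        long end = System.currentTimeMillis();
        System.out.println("took: " + (end - start));
    }
}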
