lucene-java-user mailing list archives

From "peter velthuis" <peter...@gmail.com>
Subject Re: Indexing very slow.
Date Mon, 03 Jul 2006 10:57:24 GMT
hehe that works.. it's now racing through 10 000 docs in a couple of seconds :)
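
A rough way to see why the change helps so much, based on the explanation in the replies quoted below: if optimize() rewrites the whole index into one segment, then calling it after every document rewrites the entire index each time, while calling it once at the end rewrites it only once. The sketch below is a toy cost model, not Lucene code. ToyIndex and its "documents rewritten" counter are made up for illustration, and it ignores Lucene's automatic segment merging and the cost of opening and closing the writer.

```java
import java.util.ArrayList;
import java.util.List;

public class OptimizeCostSketch {

    /** Toy stand-in for an index: just a list of segment sizes, in documents. */
    static final class ToyIndex {
        private final List<Long> segments = new ArrayList<>();
        private long docsRewritten = 0;

        /** Each added document starts out as its own one-document segment. */
        void addDocument() {
            segments.add(1L);
        }

        /** Merge all segments into one; the assumed cost is every document rewritten. */
        void optimize() {
            if (segments.size() <= 1) {
                return; // already a single segment, nothing to merge
            }
            long total = 0;
            for (long size : segments) {
                total += size;
            }
            docsRewritten += total;
            segments.clear();
            segments.add(total);
        }

        long docsRewritten() {
            return docsRewritten;
        }
    }

    /** The pattern from the original post: optimize after every single document. */
    static long optimizeAfterEveryDocument(int numDocs) {
        ToyIndex index = new ToyIndex();
        for (int i = 0; i < numDocs; i++) {
            index.addDocument();
            index.optimize();
        }
        return index.docsRewritten();
    }

    /** The suggested pattern: add everything, then optimize once. */
    static long optimizeOnceAtEnd(int numDocs) {
        ToyIndex index = new ToyIndex();
        for (int i = 0; i < numDocs; i++) {
            index.addDocument();
        }
        index.optimize();
        return index.docsRewritten();
    }

    public static void main(String[] args) {
        int numDocs = 10_000;
        System.out.println("optimize after every doc: "
                + optimizeAfterEveryDocument(numDocs) + " docs rewritten");
        System.out.println("optimize once at the end: "
                + optimizeOnceAtEnd(numDocs) + " docs rewritten");
    }
}
```

In this model the per-document variant grows roughly with the square of the number of documents, while the batched variant grows linearly, which would be consistent with indexing starting fast and then slowing to a crawl.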

2006/7/3, Aleksander M. Stensby <aleksander.stensby@integrasco.no>:
> Ah, didn't see that. Yeah, you should have something like
>
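> // open the writer once, with the same arguments as in your indexDocument()
> IndexWriter writer = new IndexWriter(indexDirectory, analyzer, false);
>
> // "docs" is a placeholder for however you iterate over your documents
> for (Document doc : docs) {
>     writer.addDocument(doc);
> }
>
> writer.optimize();  // once, after the last document
> writer.close();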
>
> batching it up will make it faster, yes
>
> On Mon, 03 Jul 2006 11:43:03 +0200, Volodymyr Bychkoviak
> <vbychkoviak@i-hypergrid.com> wrote:
>
> > The problem is hidden in these lines:
> >  >       writer.optimize();
> >  >       writer.close();
> >
> > You should keep one index writer open for all document additions and
> > close it only after adding the last document.
> >
> > optimize() merges all index segments into a single segment, and as the
> > index grows this takes longer and longer. Optimizing does nothing to
> > speed up indexing, so it should be performed once, at the end of
> > indexing, to optimize the index for searching.
> >
> > peter velthuis wrote:
> >> When I start the program it's fast, about 10 docs per second, but
> >> after about 15000 it slows down very much. Now it does 1 doc per
> >> second and it is at nr. 40 000 after a whole night of indexing. These
> >> are VERY small docs with very little information. This is what and
> >> how I index it:
> >>
> >>    Document doc = new Document();
> >>    doc.add(new Field("field1", field1, Field.Store.YES, Field.Index.TOKENIZED));
> >>    doc.add(new Field("field2", field2, Field.Store.YES, Field.Index.TOKENIZED));
> >>    doc.add(new Field("field3", field3, Field.Store.YES, Field.Index.TOKENIZED));
> >>    doc.add(new Field("field4", field4, Field.Store.YES, Field.Index.TOKENIZED));
> >>    doc.add(new Field("field5", field5, Field.Store.YES, Field.Index.TOKENIZED));
> >>    doc.add(new Field("field6", field6, Field.Store.YES, Field.Index.TOKENIZED));
> >>    doc.add(new Field("contents", contents, Field.Store.NO, Field.Index.TOKENIZED));
> >>
> >>
> >>
> >> and this:
> >>
> >>
> >>    String indexDirectory = "lucdex2";
> >>
> >>    private void indexDocument(Document document) throws Exception {
> >>        Analyzer analyzer  = new StandardAnalyzer();
> >>        IndexWriter writer = new IndexWriter(indexDirectory, analyzer, false);
> >>        // writer.setUseCompoundFile(true);
> >>        writer.addDocument(document);
> >>        writer.optimize();
> >>        writer.close();
> >>    }
> >>
> >>
> >>
> >> I read the data out of a MySQL database, but that can't be the problem
> >> since the data is in memory.
> >>
> >> Also, I use cygwin. When I try indexing on Windows in a program like
> >> NetBeans or BlueJ, it crashes Windows after about 5000 docs: it says
> >> "beep" and does a complete shutdown...
> >>
> >> Peter
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
>
>
>
> --
> Aleksander M. Stensby
> Software Developer
> Integrasco A/S
> aleksander.stensby@integrasco.no
> Tlf.: +47 41 22 82 72
>
