lucene-java-user mailing list archives

From: Toke Eskildsen ...@statsbiblioteket.dk>
Subject: RE: Swapping between indexes
Date: Fri, 07 Mar 2008 14:17:29 GMT

On Thu, 2008-03-06 at 18:40 +0100, spring@gmx.eu wrote:
> > >  With a commit after every add: 30 min.
> > >  With a commit after 100 add: 23 min.
> > >  Only one commit: 20 min.

[...]

> I think it is a real world scenario because one always has to read the docs
> from somewhere and often has to store the index state somewhere else.

Very true, but the time it takes to create the documents varies greatly
between systems.

I tried repeating your test by creating a simple 14 MB index with 10,000
documents on my desktop machine. Each document was made up of:

 - one non-tokenized unique stored indexed field
 - one non-tokenized indexed stored field with one of 9 terms
 - one tokenized field with 930 random characters, including space

With a commit after every add: 4 min, 46 sec.
With a commit after every 100 adds: 12 sec.
Only one commit: 8 sec.
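
For reference, the three runs above differ only in how often the writer is
committed. The loop can be sketched roughly as below. This is not the code
used for the measurements: the Indexer interface and CountingIndexer class
are hypothetical stand-ins (not Lucene APIs), and a real test would add
documents to an actual index writer instead.

```java
public class CommitIntervalSketch {

    /** Hypothetical stand-in for an index writer; NOT a Lucene API. */
    interface Indexer {
        void add(int docId);
        void commit();
    }

    /** Hypothetical in-memory implementation that only counts calls. */
    static class CountingIndexer implements Indexer {
        int adds = 0;
        int commits = 0;
        public void add(int docId) { adds++; }
        public void commit() { commits++; }
    }

    /**
     * Adds numDocs documents, committing after every commitEvery adds and
     * once more at the end if uncommitted documents remain.
     * Returns the number of commits performed.
     */
    static int indexAll(Indexer indexer, int numDocs, int commitEvery) {
        int commits = 0;
        int pending = 0;
        for (int i = 0; i < numDocs; i++) {
            indexer.add(i);
            pending++;
            if (pending == commitEvery) {
                indexer.commit();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {
            indexer.commit();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        int numDocs = 10_000;
        // Commit after every add, after every 100 adds, and only once.
        int[] intervals = {1, 100, numDocs};
        for (int interval : intervals) {
            long start = System.nanoTime();
            int commits = indexAll(new CountingIndexer(), numDocs, interval);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("commit every " + interval + " adds: "
                    + commits + " commits, " + elapsedMs + " ms");
        }
    }
}
```

With the counting stand-in the timings are meaningless; the point is only
that the number of commits drops from 10,000 to 100 to 1 across the runs.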


Guesstimating the amortized time spent on adding each document on such a
small corpus, by blatantly ignoring the overhead of creating the
documents, gives us the following:

With a commit after every add: (286 sec / 10,000 docs) 28.6 ms.
With a commit after every 100 adds: (12 sec / 10,000 docs) 1.2 ms.
Only one commit: (8 sec / 10,000 docs) 0.8 ms.
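
The per-document figures above are just total seconds divided by the number
of documents, expressed in milliseconds. A small sketch of that arithmetic,
using the measured totals from this mail (the class and method names are
made up for illustration):

```java
import java.util.Locale;

public class AmortizedCost {

    /** Average milliseconds per document, ignoring document-creation overhead. */
    static double msPerDoc(long totalSeconds, long docs) {
        return totalSeconds * 1000.0 / docs;
    }

    public static void main(String[] args) {
        long docs = 10_000;
        long everyAdd = 4 * 60 + 46;   // 4 min, 46 sec
        long every100 = 12;            // 12 sec
        long once = 8;                 // 8 sec

        System.out.printf(Locale.ROOT, "commit after every add:      %.1f ms%n",
                msPerDoc(everyAdd, docs));
        System.out.printf(Locale.ROOT, "commit after every 100 adds: %.1f ms%n",
                msPerDoc(every100, docs));
        System.out.printf(Locale.ROOT, "only one commit:             %.1f ms%n",
                msPerDoc(once, docs));
    }
}
```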


