lucene-dev mailing list archives

From eks dev <>
Subject Post mortem kudos for (LUCENE-843) :)
Date Thu, 12 Jul 2007 17:17:18 GMT
Let the numbers speak:

INDEX SIZE: 58 million docs, 2.5 GB on disk
- two tokenized fields, each averaging 4 tokens (rather small), approx. 2 million unique tokens
- one binary stored field (one VInt)
- HW: commodity AMD PC, 2.8 GHz (or so), 2 GB RAM, single disk, Win XP 64-bit, 32-bit JVM 6.0

Before LUCENE-843, indexing speed was 5-6k documents per second (and I believed this was already
as fast as it gets).
After (yesterday's trunk version): 60-65k documents per second!
All (exhaustive!) tests pass on this index.
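
To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers quoted in this message (58 million docs, 5-6k vs. 60-65k docs per second). The function names are made up for illustration, and it assumes the quoted throughput is sustained over the whole run, which the message does not state.

```python
# Rough arithmetic on the figures quoted in the message.
# Assumption: throughput stays constant for the whole indexing run.

TOTAL_DOCS = 58_000_000      # "58 million docs"
BEFORE = (5_000, 6_000)      # docs/sec before LUCENE-843 (low, high)
AFTER = (60_000, 65_000)     # docs/sec after LUCENE-843 (low, high)


def speedup_range(before, after):
    """Return the (most conservative, most optimistic) speedup factor."""
    conservative = after[0] / before[1]   # slowest "after" vs. fastest "before"
    optimistic = after[1] / before[0]     # fastest "after" vs. slowest "before"
    return conservative, optimistic


def indexing_minutes(total_docs, docs_per_sec):
    """Return the time in minutes to index total_docs at a constant rate."""
    return total_docs / docs_per_sec / 60


if __name__ == "__main__":
    low, high = speedup_range(BEFORE, AFTER)
    print(f"speedup: {low:.0f}x to {high:.0f}x")
    for label, rates in (("before", BEFORE), ("after", AFTER)):
        slowest = indexing_minutes(TOTAL_DOCS, rates[0])
        fastest = indexing_minutes(TOTAL_DOCS, rates[1])
        print(f"{label}: {fastest:.0f} to {slowest:.0f} minutes for the full index")
```

Under that assumption, the speedup works out to roughly an order of magnitude, and a full rebuild drops from a few hours to well under half an hour.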

Settings: autoCommit = false, 24 MB RAM buffer, and char[] instead of String for Token (this was the
reason we separated Analysis in two phases, leaving for Lucene Analyzer only simple whitespace

Brilliant work, nothing more and nothing less! 
