lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Friedman <pfried...@macromedia.com>
Subject RE: Strange behavior indexing 1000 documents in RAMDirectory
Date Mon, 12 Nov 2001 14:03:41 GMT
Doug,

How stable are the nightly builds?  When do you expect a final release of v.1.2?

Thanks.
Paul

-----Original Message-----
From: Doug Cutting [mailto:DCutting@grandcentral.com]
Sent: Friday, November 09, 2001 6:02 PM
To: 'Lucene Users List'
Subject: RE: Strange behavior indexing 1000 documents in RAMDirectory


Paul,

Thanks for the nice test case!

This bug was fixed a week or so ago.  Try the latest nightly release from:
  http://jakarta.apache.org/builds/jakarta-lucene/nightly/

Using that, I get the desired output:

Index and search 1000 documents
RAMDirectory: indexed 1000 in 1532 msec
RAMDirectory indexing: search 1000 in 0 msec
FSDirectory: indexed 1000 in 8622 msec
FSDirectory indexing: search 1000 in 0 msec

RAMDirectory is sure a lot faster!  Looks like I should add an option to let
more of indexing automatically happen in a RAMDirectory...

Currently mergeFactor documents are indexed in RAM and then merged to disk.
It would be fairly easy to add a limit so that, up to N documents could be
indexed in RAM before any are written to disk, where N is user-specified.
IndexWriter.close() would still flush RAM-based segments to disk.  The
default should still probably be fairly low, in case folks are adding large
documents and don't have much RAM, but folks with RAM and small documents
could raise it.

A better approach would be to have users specify the limit in bytes rather
than documents, and to flush the RAM-based segments when the RAM directory's
size reaches that limit.  This would take a bit more work, but still
shouldn't be hard.  Then you could dedicate, say, 10MB to indexing,
regardless of document size.  Hmm...

Doug

--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message