lucene-dev mailing list archives

From "Matthew Mastracci (JIRA)" <>
Subject [jira] Commented: (LUCENE-753) Use NIO positional read to avoid synchronization in FSIndexInput
Date Sat, 23 Aug 2008 22:31:44 GMT


Matthew Mastracci commented on LUCENE-753:


bq. Are you really sure you're not accidentally closing the searcher before calling Searcher.docFreqs?
Are you calling docFreqs directly from your app?

Our IndexReaders are actually managed in a shared pool (currently 8 IndexReaders, shared round-robin
style as requests come in).  We have some custom reference counting logic that's supposed
to keep the readers alive as long as somebody has them open.  As new index snapshots come
in, the IndexReaders are re-opened and reference counts ensure that any old index readers
in use are kept alive until the searchers are done with them.  I'm guessing we have an error
in our reference counting logic that just doesn't show up under MMapDirectory (as you mentioned,
close() is a no-op).
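
For illustration, the kind of reference counting described above might look roughly like the sketch below. This is not the actual pool code from this comment: RefCounted, tryAcquire and release are hypothetical names, and the wrapped resource is just a java.io.Closeable standing in for an IndexReader. The pool holds the first reference, each request takes one before searching, and the underlying resource is only closed when the last reference is dropped.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical ref-counting wrapper; not a Lucene class. */
final class RefCounted<T extends Closeable> {
    private final T resource;
    // Starts at 1: the pool itself holds the first reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCounted(T resource) {
        this.resource = resource;
    }

    /** Takes a reference only if the resource has not been closed yet. */
    boolean tryAcquire() {
        while (true) {
            int n = refs.get();
            if (n <= 0) {
                return false; // already closed; the caller must fetch a newer snapshot
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    T get() {
        return resource;
    }

    /** Drops one reference; the last release closes the resource. */
    void release() {
        int n = refs.decrementAndGet();
        if (n == 0) {
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        } else if (n < 0) {
            throw new IllegalStateException("released more often than acquired");
        }
    }

    int refCount() {
        return refs.get();
    }
}
```

In a scheme like this, a searcher that skips tryAcquire, or a code path that calls release twice, ends up using a resource that has already been closed. That is the sort of bug that stays hidden when close() does nothing.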

We're calling docFreqs directly from our app.  I'm guessing that it just happens to be the
most likely item to be called after we roll to a new index snapshot.

I don't have hard performance numbers right now, but we were having a hard time saturating
I/O or CPU with FSDirectory.  The locking was basically killing us.  When we switched to MMapDirectory
and turned on compound files, our performance jumped at least 2x.  The preliminary results
I'm seeing with NIOFSDirectory seem to indicate that it's slightly faster than MMapDirectory.
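
As background on the technique named in the issue title, the sketch below shows a plain NIO positional read. It is not the NIOFSDirectory patch itself; PositionalReader and readAt are hypothetical names. The relevant call is FileChannel.read(ByteBuffer, long), which reads at an explicit offset and does not modify the channel's own position, so callers sharing one channel do not need a lock around a seek-then-read pair.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper illustrating positional reads; not Lucene code. */
final class PositionalReader implements AutoCloseable {
    private final FileChannel channel;

    PositionalReader(Path file) throws IOException {
        this.channel = FileChannel.open(file, StandardOpenOption.READ);
    }

    /** Reads exactly length bytes starting at position, without seeking. */
    byte[] readAt(long position, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        while (buf.hasRemaining()) {
            // The offset is passed explicitly; the channel's position is untouched.
            int n = channel.read(buf, position + buf.position());
            if (n < 0) {
                throw new EOFException("read past end of file at " + position);
            }
        }
        return buf.array();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```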

I'll try setting our app back to using the old FSDirectory and see if the exceptions still
occur.  I'll also try to fiddle with our unit tests to make sure we're correctly ref-counting
all of our index readers.

BTW, I ran a quick FSDirectory/MMapDirectory/NIOFSDirectory shootout.  It uses a parallel
benchmark that roughly models our real-life workload.  I ran the benchmark once
through to warm the disk cache, then got the following.  The numbers are fairly stable across
runs once the disk caches are warm:

FS: 33644ms
MMap: 28616ms
NIOFS: 33189ms

I'm a bit surprised at the results myself, but I've spent a bit of time tuning the indexes
to maximize concurrency.  I'll double-check that the benchmark is correctly running all of
the tests.

The benchmark effectively runs 10-20 queries in parallel at a time, then waits for all queries
to complete.  It does this end-to-end for a number of different query batches, then totals
up the time to complete each batch.
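
A harness with the shape described above could be sketched as follows. This is an assumption about the structure only, not the benchmark that produced the numbers in this comment: BatchBenchmark and runBatches are hypothetical names, and each "query" is an arbitrary Callable. Every batch is submitted in full, the caller waits for all of its tasks, and the per-batch wall-clock times are summed.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical batch-parallel timing harness; not the benchmark from this thread. */
final class BatchBenchmark {
    /** Runs each batch in parallel, waits for it to finish, and returns the summed time in ms. */
    static long runBatches(List<List<Callable<Void>>> batches, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long totalNanos = 0;
        try {
            for (List<Callable<Void>> batch : batches) {
                long start = System.nanoTime();
                // invokeAll blocks until every task in the batch has completed.
                for (Future<Void> f : pool.invokeAll(batch)) {
                    f.get(); // surfaces any exception thrown by a query
                }
                totalNanos += System.nanoTime() - start;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("query failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
        return TimeUnit.NANOSECONDS.toMillis(totalNanos);
    }
}
```

To match the procedure in this comment, one would call runBatches once as an untimed warm-up pass for the disk cache and then report the result of a second call.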

> Use NIO positional read to avoid synchronization in FSIndexInput
> ----------------------------------------------------------------
>                 Key: LUCENE-753
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Store
>            Reporter: Yonik Seeley
>            Assignee: Michael McCandless
>             Fix For: 2.4
>         Attachments:,,,,,,, FSDirectoryPool.patch, FSIndexInput.patch, FSIndexInput.patch, LUCENE-753.patch, LUCENE-753.patch, lucene-753.patch, lucene-753.patch
> As suggested by Doug, we could use NIO pread to avoid synchronization on the underlying
> This could mitigate any MT performance drop caused by reducing the number of files in the index format.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
