lucene-dev mailing list archives

From "Michael McCandless (JIRA)"
Subject [jira] Commented: (LUCENE-1313) Realtime Search
Date Tue, 28 Apr 2009 15:47:30 GMT


Michael McCandless commented on LUCENE-1313:

Yonik raised a good question on LUCENE-1618: what gains do we really expect to see
by using RAMDir for the tiny, recently flushed segments?

It would be nice if we could approximately measure this before putting more work into this
issue -- if the gains are not "decent" this optimization may not be worthwhile.

Of course, we are talking about 100s of milliseconds for the turnaround time to add docs &
open an NRT reader, so if the time for writing/opening many tiny files in RAMDir vs FSDir
differs by, say, 10s of msecs, then we should pursue this.  We should also consider that the
IO system may well be quite busy (doing merges, backups, etc.), which would make creating
tiny files slower.
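
As a rough illustration of the kind of measurement meant here, the following is a minimal
sketch that times writing many tiny files to a temp directory against writing the same bytes
into heap buffers. It uses only the JDK, not RAMDirectory or FSDirectory, so it is a stand-in
for the real comparison rather than a Lucene benchmark. The class name, method names, file
names, file count and file size are all assumptions made for illustration. It does not fsync,
so the disk numbers mostly reflect file creation cost and the OS cache.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a RAMDir vs FSDir comparison; not Lucene code.
public class TinyFileTiming {

    /** Writes fileCount files of fileSize bytes each into dir; returns elapsed nanoseconds. */
    static long timeDiskWrites(Path dir, int fileCount, int fileSize) throws IOException {
        byte[] payload = new byte[fileSize];
        long start = System.nanoTime();
        for (int i = 0; i < fileCount; i++) {
            // One file create + write + close per tiny file, as an FS-backed directory would do.
            try (OutputStream out = Files.newOutputStream(dir.resolve("tiny_" + i + ".bin"))) {
                out.write(payload);
            }
        }
        return System.nanoTime() - start;
    }

    /** Writes the same payloads into heap buffers; returns elapsed nanoseconds. */
    static long timeHeapWrites(int fileCount, int fileSize) {
        byte[] payload = new byte[fileSize];
        List<byte[]> files = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < fileCount; i++) {
            ByteArrayOutputStream out = new ByteArrayOutputStream(fileSize);
            out.write(payload, 0, payload.length);
            files.add(out.toByteArray());
        }
        return System.nanoTime() - start;
    }

    /** Deletes every file directly inside dir, then dir itself. */
    static void deleteFlatDirectory(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                Files.delete(entry);
            }
        }
        Files.delete(dir);
    }

    public static void main(String[] args) throws IOException {
        int fileCount = 500;  // assumed: tiny segments times files per segment
        int fileSize = 256;   // assumed: bytes per tiny file
        Path dir = Files.createTempDirectory("tiny-file-timing");
        try {
            long diskNanos = timeDiskWrites(dir, fileCount, fileSize);
            long heapNanos = timeHeapWrites(fileCount, fileSize);
            System.out.println("disk: " + diskNanos / 1_000_000.0 + " ms");
            System.out.println("heap: " + heapNanos / 1_000_000.0 + " ms");
        } finally {
            deleteFlatDirectory(dir);
        }
    }
}
```

Running it while the disk is busy (for example during a large copy) would give a feel for the
"IO system is busy" case; a single cold run is noisy, so repeated runs are advisable.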

A simpler optimization might be to allow using CFS for tiny files (even when CFS is turned
off), but build the CFS in RAM (i.e., write tiny files first to RAMFiles, then make the CFS
file on disk).  That might get most of the gains since the FSDir sees only one file created
per tiny segment, not N.
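
To make the idea concrete, here is a minimal sketch of "buffer the tiny files in RAM, then
write one combined file to disk". It uses only the JDK. The on-disk layout (a count, then
name, length and bytes per entry) is invented for this example and is not Lucene's actual
CFS format, and the class, method and file names are likewise hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the layout below is made up for illustration, not Lucene's CFS format.
public class RamCompoundSketch {

    /** Packs in-memory files into one byte array: entry count, then (name, length, bytes) each. */
    static byte[] pack(Map<String, byte[]> ramFiles) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(ramFiles.size());
            for (Map.Entry<String, byte[]> entry : ramFiles.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeInt(entry.getValue().length);
                out.write(entry.getValue());
            }
        }
        return buffer.toByteArray();
    }

    /** Creates exactly one file on disk, regardless of how many tiny files were buffered. */
    static void writeCompound(Map<String, byte[]> ramFiles, Path target) throws IOException {
        Files.write(target, pack(ramFiles));
    }

    /** Reads the combined file back into a name -> bytes map, preserving entry order. */
    static Map<String, byte[]> readCompound(Path source) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(Files.newInputStream(source))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                byte[] data = new byte[in.readInt()];
                in.readFully(data);
                files.put(name, data);
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // The tiny per-segment files live only on the heap at this point.
        Map<String, byte[]> ramFiles = new LinkedHashMap<>();
        ramFiles.put("seg0.part1", new byte[] {1, 2, 3});
        ramFiles.put("seg0.part2", new byte[] {4, 5});
        ramFiles.put("seg0.part3", new byte[] {6});

        Path target = Files.createTempFile("seg0", ".compound");
        try {
            writeCompound(ramFiles, target);
            Map<String, byte[]> restored = readCompound(target);
            System.out.println("entries restored: " + restored.size());
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

The point of the design is that the file system sees a single create/write/close per tiny
segment; whether that captures most of the benefit of a RAMDir is exactly what would need
measuring.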

> Realtime Search
> ---------------
>                 Key: LUCENE-1313
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch,
>                      LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch,
>                      lucene-1313.patch
> Realtime search with transactional semantics.  
> Possible future directions:
>   * Optimistic concurrency
>   * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication.
> It is difficult to replicate using other methods because while the document may easily be
> serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and searching
