lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-1313) Realtime Search
Date Fri, 24 Apr 2009 20:04:30 GMT


Michael McCandless commented on LUCENE-1313:

bq. I don't think it's necessary to immediately write ram segments to disk

I agree: it should be fine from IndexWriter's standpoint if some
segments live in a private RAMDir and others live in the "real" dir.

In fact, early versions of LUCENE-843 did exactly this: IW's RAM
buffer is not as efficient as a written segment, and so you can gain
some RAM efficiency by flushing first to RAM and then merging to disk.

I think we could adopt a simple criteria: you flush the new segment to
the RAM Dir if net RAM used is < maxRamBufferSizeMB.  This way no
further configuration is needed.  On auto-flush triggering you then
must take into account the RAM usage by this RAM Dir.  On commit,
these RAM segments must be migrated to the real dir (preferably by
forcing a merge, somehow).

A near realtime reader would also happily mix "real" Dir and RAMDir

This should work well I think, and should not require a separate
RAMIndex class, and won't block things when the RAM segments are
migrated to disk by CMS.

> Realtime Search
> ---------------
>                 Key: LUCENE-1313
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch,
LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch
> Realtime search with transactional semantics.  
> Possible future directions:
>   * Optimistic concurrency
>   * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication.
 It is difficult to replicate using other methods because while the document may easily be
serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and searching

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message