lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1313) Realtime Search
Date Mon, 04 May 2009 22:16:30 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705788#action_12705788 ]

Michael McCandless commented on LUCENE-1313:
--------------------------------------------

{quote}
> OK, though I'd like to simply always use FSD, even if
> primary & secondary are the same dir.

How will always using FSD work? Doesn't it assume writing to two
different directories?
{quote}

I think on creating IW the user should state (via new expert ctor)
that they intend to use it for NRT (say, a new boolean
"enableNearRealTime").

Then we could pass IFD either an FSD (when in NRT mode) or the normal
directory when not in NRT mode.  IFD would no longer have to
duplicate FSD's logic (summing the two dirs' listAll results, the
getDirectoryForFile lookup).
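
To make the idea concrete, here is a minimal sketch of why handing IFD a
single wrapping directory removes the duplication.  Everything in it is a
hypothetical stand-in (ToyDir, ToySwitchDir, ToyFileDeleter, and the
route-by-extension rule are assumptions, not Lucene classes or the patch's
code); it only shows the listing and the per-file routing living in one
place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical minimal directory: maps a file name to its length in bytes. */
class ToyDir {
    private final TreeMap<String, Long> files = new TreeMap<>();

    void put(String name, long length) { files.put(name, length); }

    List<String> listAll() { return new ArrayList<>(files.keySet()); }

    long fileLength(String name) {
        Long len = files.get(name);
        if (len == null) throw new IllegalArgumentException("no such file: " + name);
        return len;
    }
}

/** Hypothetical switching wrapper: routes each file to one of two directories. */
class ToySwitchDir extends ToyDir {
    private final Set<String> primaryExtensions;
    private final ToyDir primary;
    private final ToyDir secondary;

    ToySwitchDir(Set<String> primaryExtensions, ToyDir primary, ToyDir secondary) {
        this.primaryExtensions = primaryExtensions;
        this.primary = primary;
        this.secondary = secondary;
    }

    /** Assumed routing rule: files whose extension is listed go to primary. */
    ToyDir getDirectoryForFile(String name) {
        int dot = name.lastIndexOf('.');
        String ext = dot < 0 ? "" : name.substring(dot + 1);
        return primaryExtensions.contains(ext) ? primary : secondary;
    }

    @Override void put(String name, long length) {
        getDirectoryForFile(name).put(name, length);
    }

    /** Sum of both listings; list only once if both sides are the same dir. */
    @Override List<String> listAll() {
        List<String> all = new ArrayList<>(primary.listAll());
        if (secondary != primary) all.addAll(secondary.listAll());
        return all;
    }

    @Override long fileLength(String name) {
        return getDirectoryForFile(name).fileLength(name);
    }
}

/** Hypothetical consumer standing in for IFD: it only ever sees one ToyDir. */
class ToyFileDeleter {
    static long totalBytes(ToyDir dir) {
        long total = 0;
        for (String name : dir.listAll()) total += dir.fileLength(name);
        return total;
    }
}
```

ToyFileDeleter works unchanged whether it is given a plain ToyDir or a
ToySwitchDir, including the case where primary and secondary are the same
directory.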

SegmentInfos.hasExternalSegments, and MultiSegmentReader ctor, should
be "smart" when they're passed an FSD (probably we should add a
Directory.contains(Directory) method, which by default returns true if
this.equals(dir), but FSD would override to return true if the
incoming dir equals either primary or secondary).
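
A rough sketch of the proposed contains() check, under stated assumptions:
SketchDirectory and SketchSwitchDirectory are hypothetical stand-ins rather
than Lucene's Directory or FSD, and hasExternalSegments here is a simplified
illustration, not the real SegmentInfos method.  Letting the wrapper also
contain itself is an extra assumption beyond what the comment states.

```java
import java.util.List;

/** Hypothetical stand-in for a directory; not Lucene's Directory class. */
class SketchDirectory {
    private final String name;

    SketchDirectory(String name) { this.name = name; }

    /** Default rule: a directory contains only itself. */
    boolean contains(SketchDirectory other) { return this.equals(other); }

    @Override public String toString() { return name; }
}

/** Hypothetical stand-in for a wrapper over a primary and a secondary dir. */
class SketchSwitchDirectory extends SketchDirectory {
    private final SketchDirectory primary;
    private final SketchDirectory secondary;

    SketchSwitchDirectory(SketchDirectory primary, SketchDirectory secondary) {
        super("switch(" + primary + "," + secondary + ")");
        this.primary = primary;
        this.secondary = secondary;
    }

    /** Override: true for either wrapped dir (and, by assumption, for itself). */
    @Override boolean contains(SketchDirectory other) {
        return super.contains(other)
            || primary.equals(other)
            || secondary.equals(other);
    }
}

class ExternalSegmentCheck {
    /** A segment counts as external if the writer's dir does not contain its dir. */
    static boolean hasExternalSegments(SketchDirectory writerDir,
                                       List<SketchDirectory> segmentDirs) {
        for (SketchDirectory dir : segmentDirs) {
            if (!writerDir.contains(dir)) return true;
        }
        return false;
    }
}
```

With this shape, callers compare against the writer's directory through
contains() instead of equals(), so they need no special case for the
two-directory setup.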

Likewise, all the switching in DW to handle two dirs should be rolled
back (eg you added DW.fileLength(name, dir1, dir2), which duplicates
code in FSD).

{quote}
One issue is the ram buffer flush doubles the ram used (because
the segment is flushed as is to the RAM dir).
{quote}

I think we must keep transient RAM usage below the specified limit, so
that limits our flushing freedom.  Ie, in the NRT case, once DW's RAM
buffer exceeds half of the allowed remaining RAM budget (ie, the limit
minus total RAM segments) then we trigger a flush to RAM and then to
the "real" dir.
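
The trigger described above could be sketched roughly as follows.
NrtFlushPolicy and its parameter names are hypothetical, not part of the
patch; the half-of-remaining threshold reflects that flushing the buffer
as-is to the RAM dir briefly holds two copies of it.

```java
/** Hypothetical helper; the names and signature are assumptions. */
class NrtFlushPolicy {
    /**
     * @param bufferBytes     RAM currently used by DW's in-memory buffer
     * @param ramLimitBytes   the configured overall RAM limit
     * @param ramSegmentBytes total size of segments already held in RAM
     * @return true once the buffer exceeds half of the remaining budget
     */
    static boolean shouldFlush(long bufferBytes, long ramLimitBytes, long ramSegmentBytes) {
        // Remaining budget is the limit minus the RAM segments, never negative.
        long remaining = Math.max(0L, ramLimitBytes - ramSegmentBytes);
        // Flushing to RAM needs room for the buffer plus its flushed copy,
        // so the buffer may use at most half of what is left.
        return bufferBytes > remaining / 2;
    }
}
```

For example, with a 64 MB limit and 16 MB of RAM segments, 48 MB remain, so
the flush triggers once the buffer grows past 24 MB.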

Or... we could flush the new segment directly to the real dir as one
segment, and merge all prior RAM segments as a separate new segment in
the main dir, if the free RAM is large enough.

{quote}
> this ram size should be used not only for deciding when
> it's time to merge to a disk segment, but also when it's time
> for DW to flush a new segment

In the new patch this is fixed.
{quote}

I don't see where this is taken into account?  Did you mean to attach
a new patch?


> Realtime Search
> ---------------
>
>                 Key: LUCENE-1313
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1313
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch,
> LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch,
> lucene-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch
>
>
> Realtime search with transactional semantics.  
> Possible future directions:
>   * Optimistic concurrency
>   * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication.
> It is difficult to replicate using other methods because while the document may easily be
> serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and searching
> concurrently.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

