lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1313) Realtime Search
Date Mon, 27 Apr 2009 20:58:31 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703366#action_12703366 ]

Michael McCandless commented on LUCENE-1313:
--------------------------------------------

{quote}
Agreed that DW can write the segment to the RAMDir. I started
coding along these lines; however, what do we do about the RAMDir
merging? This is why I was thinking we'll need a separate IW.
Otherwise the ram segments (if they are treated the same as disk
segments) would quickly be merged to disk? Or we have two
separate merging paths?
{quote}

Hmm, right.  We could exclude RAMDir segments from consideration by
MergePolicy?  Alternatively, we could "expect" the MergePolicy to
recognize this and be smart about choosing merges (i.e., don't mix
RAM and disk segments in one merge)?

E.g., we do in fact want some merging of the RAM segments if they get
too numerous (since that will hurt search performance).
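
As a rough illustration of the second option, here is a minimal sketch of a merge selection that only ever picks RAM segments, and only once there are too many of them. The names SegmentSketch, RamAwareMergeSketch, pickRamMerge and maxRamSegments are hypothetical and are not part of Lucene's MergePolicy API; the real policy would work on SegmentInfos.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a segment; NOT Lucene's SegmentInfo.
final class SegmentSketch {
    final String name;
    final boolean inRam; // true if the segment lives in the RAMDir

    SegmentSketch(String name, boolean inRam) {
        this.name = name;
        this.inRam = inRam;
    }
}

// Hypothetical selection logic; NOT Lucene's MergePolicy.
final class RamAwareMergeSketch {

    /**
     * Returns the RAM segments to merge together when their count exceeds
     * maxRamSegments, otherwise an empty list. Disk segments are never
     * included, so a RAM merge is never mixed with disk segments.
     */
    static List<SegmentSketch> pickRamMerge(List<SegmentSketch> segments,
                                            int maxRamSegments) {
        List<SegmentSketch> ram = new ArrayList<>();
        for (SegmentSketch s : segments) {
            if (s.inRam) {
                ram.add(s);
            }
        }
        return ram.size() > maxRamSegments ? ram : new ArrayList<>();
    }

    public static void main(String[] args) {
        List<SegmentSketch> segments = new ArrayList<>();
        segments.add(new SegmentSketch("_0", false));
        segments.add(new SegmentSketch("_1", true));
        segments.add(new SegmentSketch("_2", true));
        segments.add(new SegmentSketch("_3", true));
        for (SegmentSketch s : pickRamMerge(segments, 2)) {
            System.out.println("merge candidate: " + s.name);
        }
    }
}
```

Disk segments would still need their own selection path; the sketch only covers the RAM side.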

{quote}
> we should make with NRT is to not close the doc store
> (stored fields, term vector) files when flushing for an NRT
> reader.

Agreed, I think this feature is a must; otherwise we're doing
unnecessary in-RAM merging.
{quote}

OK, let's do this as a separate issue/optimization for NRT.  There are
two separate parts to it:

  * Ability to store doc stores in "real" directory (looks like you
    opened LUCENE-1618 for this part).
 
  * Ability to "share" IndexOutput & IndexInput

{quote}
I ran into problems with this before, I was trying to reuse
Directory to write a transaction log. It seemed theoretically
doable; however, it didn't work in practice. It could have been
the seeking and replacing, but I don't remember. FSIndexOutput
uses a writeable RAF and FSIndexInput is read-only, so why would
there be an issue?
{quote}

Hmm... seems like we need to investigate further.  We could either
"ask" an IndexOutput for its IndexInput (sharing the underlying RAF),
or try to separately open an IndexInput (which may not work on
Windows).
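
To make the first option concrete, here is a minimal sketch using only java.io.RandomAccessFile, in which a writer hands out a reader that shares its underlying RAF. The class SharedRafOutputSketch and its methods are hypothetical and are not Lucene's FSIndexOutput/FSIndexInput; the sketch assumes append-only writes and serializes all access to the file with a lock.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical append-only output that can hand out readers over the same RAF.
final class SharedRafOutputSketch implements AutoCloseable {
    private final RandomAccessFile raf;
    private long writePos = 0; // number of bytes appended so far

    SharedRafOutputSketch(File file) throws IOException {
        this.raf = new RandomAccessFile(file, "rw");
    }

    // Appends data at the end; seeks first because readers move the file pointer.
    synchronized void write(byte[] data) throws IOException {
        raf.seek(writePos);
        raf.write(data);
        writePos += data.length;
    }

    // Reads up to dest.length bytes at offset, never past what has been written.
    private synchronized int readAt(long offset, byte[] dest) throws IOException {
        if (offset >= writePos) {
            return 0;
        }
        int len = (int) Math.min(dest.length, writePos - offset);
        raf.seek(offset);
        raf.readFully(dest, 0, len);
        return len;
    }

    // "Ask" the output for an input that shares the underlying RAF.
    Input openInput() {
        return new Input();
    }

    // Reader with its own position; all file access goes through the shared RAF.
    final class Input {
        private long readPos = 0;

        int read(byte[] dest) throws IOException {
            int n = readAt(readPos, dest);
            readPos += n;
            return n;
        }
    }

    @Override
    public synchronized void close() throws IOException {
        raf.close();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("shared-raf", ".bin");
        file.deleteOnExit();
        try (SharedRafOutputSketch out = new SharedRafOutputSketch(file)) {
            Input in = out.openInput();
            out.write("hello".getBytes(StandardCharsets.UTF_8));
            byte[] buf = new byte[16];
            int n = in.read(buf);
            System.out.println(new String(buf, 0, n, StandardCharsets.UTF_8));
        }
    }
}
```

Because the reader never opens the file a second time, this avoids the separate-open path entirely, at the cost of contention on the single file pointer.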

{quote}
To implement this functionality in parallel (and perhaps make
the overall patch cleaner), writing doc stores directly to a
separate directory can be a different patch? There can be an
option IW.setDocStoresDirectory(Directory) that the patch
implements? Then some unit tests that are separate from the near
realtime portion.
{quote}

Yes, separate issue (LUCENE-1618).


> Realtime Search
> ---------------
>
>                 Key: LUCENE-1313
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1313
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch
>
>
> Realtime search with transactional semantics.  
> Possible future directions:
>   * Optimistic concurrency
>   * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication. It is difficult to replicate using other methods because while the document may easily be serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and searching concurrently.
