lucene-dev mailing list archives

From "Jason Rutherglen (JIRA)" <>
Subject [jira] Commented: (LUCENE-1313) Realtime Search
Date Wed, 08 Apr 2009 21:44:13 GMT


Jason Rutherglen commented on LUCENE-1313:

{quote} Still, it's synthetic. If you guys (LinkedIn) have a way
to fold in some realism into the test, that'd be great, if only
"our app ingests at X docs(MB)/sec and reopens the NRT reader X
times per second" to set our ballpark. {quote}

The test we need to progress to is running the indexing side
endlessly while also reopening every X seconds, then
concurrently running searches. This way we can play with a bunch
of settings (mergescheduler threads, merge factors, max merge
docs, etc), use the python code to generate a dozen cases,
execute them and find out what seems optimal for our corpus.
It's a bit of work but probably the only way each Lucene user
can conclusively say they have the optimal settings needed for
their system. Usually there is a baseline QPS that is desired,
where the reopen delay may be increased to accommodate a lack of

The RAM dir portion of NRT indexing speeds up when more threads
are allocated, but those threads compete with search threads,
which is another issue to keep in mind.
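
As a rough illustration of the "generate a dozen cases" step, here is a
minimal Python sketch. The setting names, the value ranges, and the
generate_cases helper are all hypothetical placeholders, not the actual
benchmark code attached to this issue; each case would still need to be
handed to whatever runs the index/reopen/search loop.

```python
import itertools

# Hypothetical knobs and values (assumptions for illustration only);
# choose ranges that make sense for your own corpus and hardware.
SETTINGS = {
    "merge_scheduler_threads": [1, 2, 4],
    "merge_factor": [10, 30],
    "max_merge_docs": [100_000, 1_000_000],
}


def generate_cases(settings, reopen_delays_sec):
    """Yield one dict per combination of settings and reopen delay."""
    names = list(settings)
    for values in itertools.product(*(settings[name] for name in names)):
        for delay in reopen_delays_sec:
            case = dict(zip(names, values))
            case["reopen_delay_sec"] = delay
            yield case


# 3 * 2 * 2 setting combinations with a single reopen delay gives 12 cases.
cases = list(generate_cases(SETTINGS, reopen_delays_sec=[1.0]))

for case in cases:
    print(case)
```

If a case misses the desired baseline QPS, one option is to rerun it with
a larger value in reopen_delays_sec and compare the results.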

It might be good to add some default charting to

> Realtime Search
> ---------------
>                 Key: LUCENE-1313
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, lucene-1313.patch,
>                      lucene-1313.patch, lucene-1313.patch, lucene-1313.patch
> Realtime search with transactional semantics.  
> Possible future directions:
>   * Optimistic concurrency
>   * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication.
> It is difficult to replicate using other methods because while the document may easily be
> serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and searching

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.