lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Klaas <mike.kl...@gmail.com>
Subject Re: Indexing
Date Fri, 24 Aug 2007 23:17:54 GMT
Note that Solr is expressedly designed for this kind of thing:  every  
time you commit, a new searcher is opened in the background, warmed,  
and the swapped with the current one.  It also support autocommit  
after X updates, or after the oldest update passes X milliseconds  
without being committed.

-Mike

On 22-Aug-07, at 7:39 AM, Jonathan Ariel wrote:

> I'm not reindexing the entire index. I'm just commiting the  
> updated. But I'm
> not sure how it would affect performance to commit in real time. I  
> think
> right now I have like 10 updated per minute.
>
> On 8/22/07, Erick Erickson <erickerickson@gmail.com> wrote:
>>
>> There are several approaches. First, is your index small
>> enough to fit in RAM? You might consider just putting it all in
>> RAM and searching that.
>>
>> A more complex solution would be to keep the increments
>> in a separate RAMDir AND your FSDir, search both and
>> keep things coordinated. Something like
>>
>> open FSDIr
>> create RAMDir
>> while (whatever) {
>>    get request
>>    if (modification) {
>>        write to FSDir and RAMDir
>>   }
>>    if (search) {
>>      search FSDir
>>      open RAMDir reader
>>      search RAMDir
>>      close RAMDir reader (but not writer!)
>>   }
>> }
>>
>> close FSDIr
>> close RAMDir
>> start again from the top.
>>
>>
>>
>> Warning: I haven't done this, but it *should* work. The sticky
>> part seems to me to be coordinating deletes since the
>> open FSDir may contain documents also in the RAMDir,
>> but that's "an exercise for the reader"<G>,
>>
>> You could also define the problem away and just live
>> with a 5 minute latency.
>>
>> Best
>> Erick
>>
>> On 8/22/07, Jonathan Ariel <ionathan@gmail.com> wrote:
>>>
>>> Hi,
>>> I'm new to this list. So first of all Hello to everyone!
>>>
>>> So right now I have a little issue I would like to discuss with you.
>>> Suppose that your are in a really big application where the data  
>>> in your
>>> database is updated really fast. I reindex lucene every 5 min but  
>>> since
>> my
>>> application lists everything from lucene there are like 5 minutes  
>>> (in
>> the
>>> worse case) where I don't see new staff.
>>> What do you think would be the best aproach to this problem?
>>>
>>> Thanks!
>>>
>>> Jonathan
>>>
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message