lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <markrmil...@gmail.com>
Subject Re: thread safe shared IndexSearcher
Date Thu, 20 Sep 2007 15:56:57 GMT
Good luck Jay. Keep in mind, pretty much all LuceneIndexAccessor does is 
sync Readers with Writers and allow multiple threads to share the same 
instances of them -- nothing more. The code just forces Readers to 
refresh when Writers are used to change the index. There really isn't 
any functionality beyond that offered. Since you want to have a multi 
thread system access the same resources (which occasionally need to be 
refreshed) its not too easy to get around a synchronized block.

If I am able to extract some usable code for you soon I will let you know.

- Mark

Jay Yu wrote:
> Mark,
>
> Thanks for sharing your valuable exp. and thoughts.
> Frankly our system already has most of the functionalities 
> LuceneIndexAcessor offers. The only thing I am looking for is to sync 
> the searchers' close. That's why I am little worried about the way 
> accessor handles the searcher sync.
> I will probably give it a try to see how it performs in our system.
>
> Thanks!
>
> Jay
>
> Mark Miller wrote:
>> The method is synched, but this is because each thread *does* share 
>> the same Searcher. To maintain a cache of searchers across multiple 
>> threads, you've got to sync -- to reference count, you've got to 
>> sync. The performance hit of LuceneIndexAcessor is pretty minimal for 
>> its functionality, and frankly, for the functionality you want, you 
>> have to pay a cost. Thats not even the end of it really...your going 
>> to need to maintain a cache of Accessor objects for each index as 
>> well...and if you dont know all the indexes at startup time, access 
>> to this will also need to be synched. I wouldn't worry though -- 
>> searches are still lightening fast...that won't be the bottleneck. 
>> I'll work on getting you some code, but if your worried, try some 
>> benchmarking on the original code.
>>
>> Also, to be clear, I don't have the code in front of me, but getting 
>> a Searcher does not require waiting for a Writer to be released. 
>> Searchers are cached and resused (and instantly available) until a 
>> Writer is released. When this happens, the release Writer method 
>> waits for all the Searchers to return (happens pretty quick as 
>> searches are pretty quick), the Searcher cache is cleared, and then 
>> subsequent calls to getSearcher create new Searchers that can see 
>> what the Writer added.
>>
>> The key is use your Writer/Searcher/Reader quickly and then release 
>> it (unless your bulk loading). I've had such a system with 5+ million 
>> docs on a standard machine and searches where still well below a 
>> second after the first Searcher is cached (and even the first search 
>> is darn quick). And that includes a lot of extra crap I am doing.
>>
>> - Mark
>>
>> Jay Yu wrote:
>>> Mark,
>>>
>>> After reading the implementation of LuceneIndexAccessor.getSearcher(),
>>> I realized that the method is synchronized and wait for 
>>> writingDirector to be released. That means if we getSearcher for 
>>> each query in each thread, there might be a contention and 
>>> performance hit. In fact, even the method of release(searcher) is 
>>> costly. On the other hand, if multiple threads share share one 
>>> searcher then it'd defeat the
>>> purpose of using LuceneIndexAccessor.
>>> Do I miss sth here? What's your suggested use case for 
>>> LuceneIndexAccessor?
>>>
>>> Thanks!
>>>
>>> Jay
>>> Mark Miller wrote:
>>>> Ill respond a point at a time:
>>>>
>>>> 1.
>>>>
>>>> ****************************** Hi Maik,
>>>>
>>>> So what happens in this case:
>>>>
>>>> IndexAccessProvider accessProvider = new 
>>>> IndexAccessProvider(directory,
>>>>
>>>> analyzer);
>>>>
>>>> LuceneIndexAccessor accessor = new 
>>>> LuceneIndexAccessor(accessProvider);
>>>>
>>>> accessor.open();
>>>>
>>>> IndexWriter writer = accessor.getWriter();
>>>>
>>>> // reference to the same instance?
>>>>
>>>> IndexWriter writer2 = accessor.getWriter();
>>>>
>>>> writer.addDocument(....);
>>>>
>>>> writer2.addDocument(....);
>>>>
>>>>
>>>>
>>>> // I didn't release the writer yet
>>>>
>>>> // will this block?
>>>>
>>>> IndexReader reader = accessor.getReader();
>>>>
>>>> reader.delete(....);
>>>>
>>>> ************
>>>>
>>>> This is not really an issue. First, if you are going to delete with 
>>>> a Reader
>>>> you need to call getWritingReader and not getReader. When you do 
>>>> that, the
>>>> getWritingReader call will block until writer and writer2 are 
>>>> released. If
>>>> you are just adding a couple docs before releasing the writers, 
>>>> this is no
>>>> problem because the block will be very short. If you are loading 
>>>> tons of
>>>> docs and you want to be able to delete with a Reader in a timely 
>>>> manner, you
>>>> should release the writers every now and then (release and re-get 
>>>> the Writer
>>>> every 100 docs or something). An interactive index should not hog the
>>>> Writer, while something that is just loading a lot could hog the 
>>>> Writer.
>>>> This is no different than normal…you cannot delete with a Reader while
>>>> adding with a Writer with Lucene. This code just enforces those 
>>>> semantics.
>>>> The best solution is to just use a Writer to delete – I never get a
>>>> ReadingWriter.
>>>>
>>>> 2. http://issues.apache.org/bugzilla/show_bug.cgi?id=34995#c3
>>>>
>>>> This is no big deal either. I just added another getWriter call 
>>>> that takes a
>>>> create Boolean.
>>>>
>>>> 3. I don't think there is a latest release. This has never gotten much
>>>> official attention and is not in the sandbox. I worked straight 
>>>> from the
>>>> originally submitted code.
>>>>
>>>> 4. I will look into getting together some code that I can share. The
>>>> multisearcher changes that are need are a couple of one liners 
>>>> really, so at
>>>> a minimum I will give you the changes needed.
>>>>
>>>>
>>>>
>>>> -       Mark
>>>>
>>>>
>>>>
>>>> On 9/19/07, Jay Yu <yu@ai.sri.com> wrote:
>>>>
>>>> Mark,
>>>>
>>>>
>>>>
>>>> thanks for sharing your insight and experience about 
>>>> LuceneIndexAccessor!
>>>>
>>>> I remember seeing some people reporting some issues about it, such as:
>>>>
>>>> http://www.archivum.info/java-dev@lucene.apache.org/2005-05/msg00114.html

>>>>
>>>>
>>>> http://issues.apache.org/bugzilla/show_bug.cgi?id=34995#c3
>>>>
>>>>
>>>>
>>>> Have those issues been resolved?
>>>>
>>>>
>>>>
>>>> Where did you get the latest release? It is not in the official Lucene
>>>>
>>>> sandbox/contrib.
>>>>
>>>>
>>>>
>>>> Finally, are you willing to share your extended version to include 
>>>> your
>>>>
>>>> tweak relating to the MultiSearcher?
>>>>
>>>>
>>>>
>>>> Thanks a lot!
>>>>
>>>>
>>>>
>>>> Jay
>>>>
>>>>
>>>>
>>>> Mark Miller wrote:
>>>>
>>>>> I use option 3 extensivley and find it very effective. There is a 
>>>>> tweak or
>>>>
>>>>> two required to get it to work right with MultiSearchers, but 
>>>>> other than
>>>>
>>>>> that, the code is great. I have built a lot on top of it. I'm on 
>>>>> the list
>>>>
>>>>> all the time and would be happy to answer any questions you have in
>>>> regards
>>>>
>>>>> to LuceneIndexAccessor. Frankly, I think its overlooked far too much.
>>>>
>>>>
>>>>> - Mark
>>>>
>>>>
>>>>
>>>>> On 9/19/07, Jay Yu <yu@ai.sri.com> wrote:
>>>>
>>>>
>>>>>> In a multithread app like web app, a shared IndexSearcher could 
>>>>>> throw a
>>>>
>>>>>> AlreadyClosedException when another thread is trying to update the
>>>>
>>>>>> underlying IndexReader by closing the shared searcher after the 
>>>>>> index is
>>>>
>>>>>> updated. Searching over the past discussions on this mailing list,
I
>>>>
>>>>>> found several approaches to solve the problem.
>>>>
>>>>>> 1. use solr
>>>>
>>>>>> 2. use DelayCloseIndexSearcher
>>>>
>>>>>> 3. use LuceneIndexAccessor
>>>>
>>>>
>>>>
>>>>>> the first one is not feasible for us; some people seemed to have
>>>>
>>>>>> problems with No. 2 and I do not find a lot of discussions around

>>>>>> No.3.
>>>>
>>>>
>>>>>> I wonder if anyone has good experience on No 2 and 3?
>>>>
>>>>>> Or do I miss other better solutions?
>>>>
>>>>
>>>>>> Thanks for any suggestion/comment!
>>>>
>>>>
>>>>>> Jay
>>>>
>>>>
>>>>>> ---------------------------------------------------------------------

>>>>>>
>>>>
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message