lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <markrmil...@gmail.com>
Subject Re: thread safe shared IndexSearcher
Date Thu, 20 Sep 2007 02:18:10 GMT
The method is synched, but this is because each thread *does* share the 
same Searcher. To maintain a cache of searchers across multiple threads, 
you've got to sync -- to reference count, you've got to sync. The 
performance hit of LuceneIndexAcessor is pretty minimal for its 
functionality, and frankly, for the functionality you want, you have to 
pay a cost. Thats not even the end of it really...your going to need to 
maintain a cache of Accessor objects for each index as well...and if you 
dont know all the indexes at startup time, access to this will also need 
to be synched. I wouldn't worry though -- searches are still lightening 
fast...that won't be the bottleneck. I'll work on getting you some code, 
but if your worried, try some benchmarking on the original code.

Also, to be clear, I don't have the code in front of me, but getting a 
Searcher does not require waiting for a Writer to be released. Searchers 
are cached and resused (and instantly available) until a Writer is 
released. When this happens, the release Writer method waits for all the 
Searchers to return (happens pretty quick as searches are pretty quick), 
the Searcher cache is cleared, and then subsequent calls to getSearcher 
create new Searchers that can see what the Writer added.

The key is use your Writer/Searcher/Reader quickly and then release it 
(unless your bulk loading). I've had such a system with 5+ million docs 
on a standard machine and searches where still well below a second after 
the first Searcher is cached (and even the first search is darn quick). 
And that includes a lot of extra crap I am doing.

- Mark

Jay Yu wrote:
> Mark,
>
> After reading the implementation of LuceneIndexAccessor.getSearcher(),
> I realized that the method is synchronized and wait for 
> writingDirector to be released. That means if we getSearcher for each 
> query in each thread, there might be a contention and performance hit. 
> In fact, even the method of release(searcher) is costly. On the other 
> hand, if multiple threads share share one searcher then it'd defeat the
> purpose of using LuceneIndexAccessor.
> Do I miss sth here? What's your suggested use case for 
> LuceneIndexAccessor?
>
> Thanks!
>
> Jay
> Mark Miller wrote:
>> Ill respond a point at a time:
>>
>> 1.
>>
>> ****************************** Hi Maik,
>>
>> So what happens in this case:
>>
>> IndexAccessProvider accessProvider = new IndexAccessProvider(directory,
>>
>> analyzer);
>>
>> LuceneIndexAccessor accessor = new LuceneIndexAccessor(accessProvider);
>>
>> accessor.open();
>>
>> IndexWriter writer = accessor.getWriter();
>>
>> // reference to the same instance?
>>
>> IndexWriter writer2 = accessor.getWriter();
>>
>> writer.addDocument(....);
>>
>> writer2.addDocument(....);
>>
>>
>>
>> // I didn't release the writer yet
>>
>> // will this block?
>>
>> IndexReader reader = accessor.getReader();
>>
>> reader.delete(....);
>>
>> ************
>>
>> This is not really an issue. First, if you are going to delete with a 
>> Reader
>> you need to call getWritingReader and not getReader. When you do 
>> that, the
>> getWritingReader call will block until writer and writer2 are 
>> released. If
>> you are just adding a couple docs before releasing the writers, this 
>> is no
>> problem because the block will be very short. If you are loading tons of
>> docs and you want to be able to delete with a Reader in a timely 
>> manner, you
>> should release the writers every now and then (release and re-get the 
>> Writer
>> every 100 docs or something). An interactive index should not hog the
>> Writer, while something that is just loading a lot could hog the Writer.
>> This is no different than normal…you cannot delete with a Reader while
>> adding with a Writer with Lucene. This code just enforces those 
>> semantics.
>> The best solution is to just use a Writer to delete – I never get a
>> ReadingWriter.
>>
>> 2. http://issues.apache.org/bugzilla/show_bug.cgi?id=34995#c3
>>
>> This is no big deal either. I just added another getWriter call that 
>> takes a
>> create Boolean.
>>
>> 3. I don't think there is a latest release. This has never gotten much
>> official attention and is not in the sandbox. I worked straight from the
>> originally submitted code.
>>
>> 4. I will look into getting together some code that I can share. The
>> multisearcher changes that are need are a couple of one liners 
>> really, so at
>> a minimum I will give you the changes needed.
>>
>>
>>
>> -       Mark
>>
>>
>>
>> On 9/19/07, Jay Yu <yu@ai.sri.com> wrote:
>>
>> Mark,
>>
>>
>>
>> thanks for sharing your insight and experience about 
>> LuceneIndexAccessor!
>>
>> I remember seeing some people reporting some issues about it, such as:
>>
>> http://www.archivum.info/java-dev@lucene.apache.org/2005-05/msg00114.html 
>>
>>
>> http://issues.apache.org/bugzilla/show_bug.cgi?id=34995#c3
>>
>>
>>
>> Have those issues been resolved?
>>
>>
>>
>> Where did you get the latest release? It is not in the official Lucene
>>
>> sandbox/contrib.
>>
>>
>>
>> Finally, are you willing to share your extended version to include your
>>
>> tweak relating to the MultiSearcher?
>>
>>
>>
>> Thanks a lot!
>>
>>
>>
>> Jay
>>
>>
>>
>> Mark Miller wrote:
>>
>>> I use option 3 extensivley and find it very effective. There is a 
>>> tweak or
>>
>>> two required to get it to work right with MultiSearchers, but other 
>>> than
>>
>>> that, the code is great. I have built a lot on top of it. I'm on the 
>>> list
>>
>>> all the time and would be happy to answer any questions you have in
>> regards
>>
>>> to LuceneIndexAccessor. Frankly, I think its overlooked far too much.
>>
>>
>>> - Mark
>>
>>
>>
>>> On 9/19/07, Jay Yu <yu@ai.sri.com> wrote:
>>
>>
>>>> In a multithread app like web app, a shared IndexSearcher could 
>>>> throw a
>>
>>>> AlreadyClosedException when another thread is trying to update the
>>
>>>> underlying IndexReader by closing the shared searcher after the 
>>>> index is
>>
>>>> updated. Searching over the past discussions on this mailing list, I
>>
>>>> found several approaches to solve the problem.
>>
>>>> 1. use solr
>>
>>>> 2. use DelayCloseIndexSearcher
>>
>>>> 3. use LuceneIndexAccessor
>>
>>
>>
>>>> the first one is not feasible for us; some people seemed to have
>>
>>>> problems with No. 2 and I do not find a lot of discussions around 
>>>> No.3.
>>
>>
>>>> I wonder if anyone has good experience on No 2 and 3?
>>
>>>> Or do I miss other better solutions?
>>
>>
>>>> Thanks for any suggestion/comment!
>>
>>
>>>> Jay
>>
>>
>>>> ---------------------------------------------------------------------
>>
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>>
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message