Date: Mon, 24 Sep 2007 16:10:00 -0700
From: Jay Yu <yu@AI.SRI.COM>
To: java-user@lucene.apache.org
Subject: Re: thread safe shared IndexSearcher

Thanks for the tip.
One small improvement to the IndexAccessorFactory might be to let the 
user specify the Analyzer instead of using a default KeywordAnalyzer. 
That will, of course, make your static init of the cached accessors 
difficult unless you add more interfaces to the accessor to allow 
resetting the analyzer/Dir, as in my own version.


Jay

Mark Miller wrote:
> One final note: if you are using the IndexAccessor and you are only 
> accessing the index from one JVM, you can use the NoLockFactory and save 
> some sync cost there.
> 
> Jay Yu wrote:
>> Mark,
>>
>> Great effort getting the original Lucene index accessor package into 
>> this shape. I am sure this will benefit a lot of people using Lucene 
>> in a multithreaded environment.
>> I have a quick question for you:
>> Do you have to use the core Lucene 2.3-dev in order to use the accessor?
>>
>> I will take a look at your code to see if I can help. I used a 
>> slightly modified version of the original package in my project, but it 
>> breaks some of my tests. I hope your version works better.
>>
>> Thanks a lot!
>>
>> Jay
>>
>>
>> Mark Miller wrote:
>>> I sat down and rewrote IndexAccessor from scratch. I copied in 
>>> the same reference counting logic, pruned some things, and tried to 
>>> make the whole package a bit simpler to use. I have a few things left 
>>> to do, but it's pretty solid already. The only major thing I'd still 
>>> like to do is add an option to warm searchers before putting them in 
>>> the Searcher cache. I'd like to write some more tests as well. Any 
>>> help is greatly appreciated if you're interested in using the thing.
>>>
>>>
>>> http://myhardshadow.com/indexaccessor/trunk/src/test/com/mhs/indexaccessor/SimpleSearchServer.java 
>>>
>>>
>>> Here is an example of a class that can be instantiated in one of 
>>> multiple threads and read/modify a single index without worrying 
>>> about what any of the other threads are doing to the index at any 
>>> given time. This is a very simple example of how to use the 
>>> IndexAccessor and not necessarily an example of best practices. The 
>>> main idea is that you get your Writer, Searcher, or Reader, and then 
>>> be sure to release it in a finally block as soon as you're done with 
>>> it. For loading, you will want to load many docs with a Writer (batch 
>>> them) before releasing it, but remember that Readers will not get a 
>>> new view of the index until you release all of the Writers. So beware 
>>> of hogging a Writer unless that's what you intend.
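A minimal sketch of the acquire/release pattern described above, written 
without Lucene so that it runs on its own. SketchAccessor, SearcherHandle, 
and every method name here are hypothetical stand-ins, not the 
IndexAccessor API. The generation counter only models the rule that 
readers see a new view once the last Writer has been released.

```java
/**
 * Hypothetical stand-in for a reference-counted index accessor.
 * None of these names come from the IndexAccessor package.
 */
class SketchAccessor {

    /** Handle given to callers; remembers which index view it was opened on. */
    static final class SearcherHandle {
        final int generation;
        SearcherHandle(int generation) { this.generation = generation; }
    }

    private final Object lock = new Object();
    private int searcherRefs = 0; // searchers currently checked out
    private int writerRefs = 0;   // writers currently checked out
    private int generation = 0;   // bumped when the last writer is released

    /** Only the check-out itself is synchronized, not the search work. */
    SearcherHandle getSearcher() {
        synchronized (lock) {
            searcherRefs++;
            return new SearcherHandle(generation);
        }
    }

    void release(SearcherHandle handle) {
        synchronized (lock) {
            if (searcherRefs == 0) {
                throw new IllegalStateException("no searcher is checked out");
            }
            searcherRefs--;
        }
    }

    void getWriter() {
        synchronized (lock) {
            writerRefs++;
        }
    }

    void releaseWriter() {
        synchronized (lock) {
            if (writerRefs == 0) {
                throw new IllegalStateException("no writer is checked out");
            }
            writerRefs--;
            if (writerRefs == 0) {
                generation++; // a new view becomes visible only now
            }
        }
    }

    int activeSearchers() {
        synchronized (lock) {
            return searcherRefs;
        }
    }

    /** Typical call site: acquire, use, and release in a finally block. */
    static int searchOnce(SketchAccessor accessor) {
        SearcherHandle searcher = accessor.getSearcher();
        try {
            return searcher.generation; // stand-in for running a query
        } finally {
            accessor.release(searcher);
        }
    }
}
```

In this sketch, a caller that holds two writers and releases only one 
still gets the old generation from searchOnce; the new generation shows up 
after the second releaseWriter call, which is why hogging a Writer delays 
fresh views for everyone else.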
>>>
>>> JavaDoc:
>>> http://myhardshadow.com/indexaccessorapi/
>>>
>>> Code:
>>> http://myhardshadow.com/indexaccessor/trunk/
>>>
>>> Jar:
>>> http://myhardshadow.com/indexaccessorreleases/indexaccessor.jar
>>>
>>>
>>> Your synchronized block concerns:
>>>
>>> The synchronized blocks that control access to the IndexAccessor do 
>>> not have a huge impact on performance. Keep in mind that the work 
>>> itself is not done in a synchronized block, just the retrieval of the 
>>> Searcher, Writer, or Reader. Even if the synchronization makes the 
>>> method twice as expensive, it is still dwarfed by the cost of 
>>> parsing queries and searching the index. This applies with or without 
>>> contention. I wrote a simple test and included the output below. You 
>>> might use the IBM Lock Analyzer for Java to further analyze these 
>>> costs. Trust me, this thing is speedy. It's many times better than 
>>> using IndexModifier.
>>>
>>> Without Contention
>>> Just retrieve and release Searcher 100000 times
>>> ----
>>> avg time:6.3E-4 ms
>>> total time:63 ms
>>>
>>> Parse query and search on 1 doc 100000 times
>>> ----
>>> avg time:0.03107 ms
>>> total time:3107 ms
>>>
>>>
>>> With Contention (40 other threads running 80000 searches)
>>> Just retrieve and release Searcher 100000 times
>>> ----
>>> avg time:0.04643 ms
>>> total time:4643 ms
>>>
>>> Parse query and search on 1 doc 100000 times
>>> ----
>>> avg time:0.64337 ms
>>> total time:64337 ms
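The figures above are consistent with avg = total / iterations (for 
example, 63 ms / 100000 = 6.3E-4 ms). A rough sketch of how such a 
measurement could be taken follows. It is not Mark's test; TimingSketch 
and its methods are hypothetical names, and the synchronized counter 
merely stands in for retrieving and releasing a Searcher.

```java
/** Hypothetical timing harness; not the test that produced the numbers above. */
class TimingSketch {
    private static final Object LOCK = new Object();
    private static int checkedOut = 0;

    /** Stand-in for "retrieve and release Searcher": two short synchronized sections. */
    static void acquireAndRelease() {
        synchronized (LOCK) { checkedOut++; }
        synchronized (LOCK) { checkedOut--; }
    }

    static int checkedOut() {
        synchronized (LOCK) { return checkedOut; }
    }

    /** Runs the task the given number of times and returns elapsed wall time in ms. */
    static long totalMillis(int iterations, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Average per-iteration time in ms, as reported in the output above. */
    static double avgMillis(long totalMillis, int iterations) {
        return (double) totalMillis / iterations;
    }
}
```

A call such as TimingSketch.totalMillis(100000, TimingSketch::acquireAndRelease) 
followed by TimingSketch.avgMillis(total, 100000) gives the two reported 
values. Absolute numbers will vary with the machine and JVM warm-up.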
>>>
>>>
>>> - Mark
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org