lucene-java-user mailing list archives

From Jay Yu ...@AI.SRI.COM>
Subject Re: thread safe shared IndexSearcher
Date Tue, 25 Sep 2007 17:47:13 GMT
Mark,

Looking at your implementation of the DefaultIndexAccessor regarding the
writer, I think there could be a problem: you have only one cached
writer, but getWriter(boolean, boolean) takes two booleans, so
ideally you would need four cached writers. Otherwise, if one starts with
a writer that overwrites the existing index, one cannot later append docs
to the index.
Am I missing something here, or have you not finished the implementation
of getWriter yet?
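
To make the concern concrete, here is a minimal sketch of a writer cache keyed on both booleans. It is not the DefaultIndexAccessor code: WriterCacheSketch, FakeWriter, and the flag names create and otherFlag are hypothetical stand-ins (the thread does not say what the two booleans mean), and no Lucene classes are used.

```java
import java.util.HashMap;
import java.util.Map;

final class WriterCacheSketch {

    /** Hypothetical stand-in for an index writer; it only records its flags. */
    static final class FakeWriter {
        final boolean create;     // assumed meaning: overwrite vs. append
        final boolean otherFlag;  // meaning unknown; placeholder name

        FakeWriter(boolean create, boolean otherFlag) {
            this.create = create;
            this.otherFlag = otherFlag;
        }
    }

    // One slot per (create, otherFlag) combination, so at most four entries.
    private final Map<String, FakeWriter> cache = new HashMap<>();

    /** Return the cached writer for this flag pair, creating it on first use. */
    synchronized FakeWriter getWriter(boolean create, boolean otherFlag) {
        String key = create + "/" + otherFlag;
        FakeWriter writer = cache.get(key);
        if (writer == null) {
            writer = new FakeWriter(create, otherFlag);
            cache.put(key, writer);
        }
        return writer;
    }

    synchronized int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        WriterCacheSketch cache = new WriterCacheSketch();
        FakeWriter overwriting = cache.getWriter(true, false);
        FakeWriter appending = cache.getWriter(false, false);
        // Different flag pairs map to different cached writers.
        System.out.println(overwriting != appending);
        System.out.println(cache.size());
    }
}
```

With a single cached slot, the second call would hand back the overwriting writer regardless of the flags passed. Keep in mind that Lucene lets only one IndexWriter hold an index's write lock at a time, so a real implementation could not keep all four open at once and would more likely reopen the writer when the requested flags change.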

Thanks!

Jay

Mark Miller wrote:
> Ah, thanks for catching that. One of the pieces I did not finish...the 
> keyword analyzer was placeholder code.
> 
> I will take your comments into account and update the code.
> 
> I have some other pieces to polish as well. Previously, I extended and 
> built upon the original code, but I can't give it away, so this is my 
> attempt at something lesser, but cleaner.
> 
> Jay Yu wrote:
>> Thanks for the tip.
>> One small improvement to the IndexAccessorFactory might be to allow 
>> the user to specify the Analyzer instead of using a default 
>> KeywordAnalyzer, which of course will make your static init of the 
>> cached accessors difficult unless you add more interfaces to the 
>> accessor to allow resetting the analyzer/Dir, as in my own version.
>>
>> Jay
>>
>> Mark Miller wrote:
>>> One final note: if you are using the IndexAccessor and you are only 
>>> accessing the index from one JVM, you can use the NoLockFactory and 
>>> save some sync cost there.
>>>
>>> Jay Yu wrote:
>>>> Mark,
>>>>
>>>> Great effort getting the original Lucene index accessor package into 
>>>> this shape. I am sure this will benefit a lot of people using Lucene 
>>>> in a multithreaded environment.
>>>> I have a quick question for you:
>>>> Do you have to use the core Lucene 2.3-dev in order to use the 
>>>> accessor?
>>>>
>>>> I will take a look at your code to see if I can help. I used a 
>>>> slightly modified version of the original package in my project, but 
>>>> it breaks some of my tests. I hope your version works better.
>>>>
>>>> Thanks a lot!
>>>>
>>>> Jay
>>>>
>>>>
>>>> Mark Miller wrote:
>>>>> I sat down and rewrote IndexAccessor from scratch. I copied in 
>>>>> the same reference counting logic, pruned some things, and tried to 
>>>>> make the whole package a bit simpler to use. I have a few things 
>>>>> left to do, but it's pretty solid already. The only major thing I'd 
>>>>> still like to do is add an option to warm searchers before putting 
>>>>> them in the Searcher cache. I'd like to write some more tests as 
>>>>> well. Any help is greatly appreciated if you're interested in using it.
>>>>>
>>>>>
>>>>> http://myhardshadow.com/indexaccessor/trunk/src/test/com/mhs/indexaccessor/SimpleSearchServer.java
>>>>>
>>>>>
>>>>> Here is an example of a class that can be instantiated in one of 
>>>>> multiple threads and read/modify a single index without worrying 
>>>>> about what any of the other threads are doing to the index at any 
>>>>> given time. This is a very simple example of how to use the 
>>>>> IndexAccessor and not necessarily an example of best practices. The 
>>>>> main idea is that you get your Writer, Searcher, or Reader, and then 
>>>>> be sure to release it in a finally block as soon as you're done with 
>>>>> it. For loading, you will want to load many docs with a Writer 
>>>>> (batch them) before releasing it, but remember that Readers will not 
>>>>> get a new view of the index until you release all of the Writers. So 
>>>>> beware of hogging a Writer unless that's what you intend.
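
The acquire/release pattern described above can be sketched roughly as follows. This is not the IndexAccessor API: RefCountedAccessorSketch and all of its methods are hypothetical names, and the "index" is reduced to a version counter so the example runs without Lucene. It only illustrates releasing in a finally block and that searchers see a new view once the last writer has been released.

```java
final class RefCountedAccessorSketch {
    private int writersInUse = 0;    // reference count for the shared writer
    private boolean dirty = false;   // a writer changed the index since the last refresh
    private int visibleVersion = 0;  // the index view that new searchers would see

    /** Hand out the shared writer by bumping its reference count. */
    synchronized void acquireWriter() {
        writersInUse++;
    }

    /** Hypothetical stand-in for adding a document through the writer. */
    synchronized void addDocument() {
        if (writersInUse == 0) {
            throw new IllegalStateException("no writer acquired");
        }
        dirty = true;
    }

    /** Drop one reference; the view is refreshed only when the last writer is released. */
    synchronized void releaseWriter() {
        if (writersInUse == 0) {
            throw new IllegalStateException("release without acquire");
        }
        writersInUse--;
        if (writersInUse == 0 && dirty) {
            visibleVersion++;
            dirty = false;
        }
    }

    synchronized int searcherVersion() {
        return visibleVersion;
    }

    /** Batch-load documents, always releasing the writer in a finally block. */
    static void loadBatch(RefCountedAccessorSketch accessor, int docs) {
        accessor.acquireWriter();
        try {
            for (int i = 0; i < docs; i++) {
                accessor.addDocument();
            }
        } finally {
            accessor.releaseWriter();
        }
    }

    public static void main(String[] args) {
        RefCountedAccessorSketch accessor = new RefCountedAccessorSketch();
        loadBatch(accessor, 100);
        System.out.println("visible version: " + accessor.searcherVersion());
    }
}
```

Batching many documents inside one acquire/release pair keeps the number of refreshes low, at the price of searchers seeing stale data for longer while the writer is held.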
>>>>>
>>>>> JavaDoc:
>>>>> http://myhardshadow.com/indexaccessorapi/
>>>>>
>>>>> Code:
>>>>> http://myhardshadow.com/indexaccessor/trunk/
>>>>>
>>>>> Jar:
>>>>> http://myhardshadow.com/indexaccessorreleases/indexaccessor.jar
>>>>>
>>>>>
>>>>> Your synchronized block concerns:
>>>>>
>>>>> The synchronized blocks that control access to the IndexAccessor 
>>>>> do not have a huge impact on performance. Keep in mind that the real 
>>>>> work is not done in a synchronized block, just the retrieval of 
>>>>> the Searcher, Writer, or Reader. Even if the synchronization makes the 
>>>>> method twice as expensive, it is still outweighed by the cost of 
>>>>> parsing queries and searching the index. This applies with or 
>>>>> without contention. I wrote a simple test and included the output 
>>>>> below. You might use the IBM Lock Analyzer for Java to further 
>>>>> analyze these costs. Trust me, this thing is speedy. It's many times 
>>>>> better than using IndexModifier.
>>>>>
>>>>> Without Contention
>>>>> Just retrieve and release Searcher 100000 times
>>>>> ----
>>>>> avg time:6.3E-4 ms
>>>>> total time:63 ms
>>>>>
>>>>> Parse query and search on 1 doc 100000 times
>>>>> ----
>>>>> avg time:0.03107 ms
>>>>> total time:3107 ms
>>>>>
>>>>>
>>>>> With Contention (40 other threads running 80000 searches)
>>>>> Just retrieve and release Searcher 100000 times
>>>>> ----
>>>>> avg time:0.04643 ms
>>>>> total time:4643 ms
>>>>>
>>>>> Parse query and search on 1 doc 100000 times
>>>>> ----
>>>>> avg time:0.64337 ms
>>>>> total time:64337 ms
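
For anyone who wants to try a comparable measurement, here is a minimal sketch of timing a synchronized acquire/release loop. It is not the test that produced the numbers above: SyncCostSketch and its methods are hypothetical, no searching is done, and results will vary with the JVM and hardware. The avgMillis helper simply divides total time by iterations, which is how the averages above relate to the totals (for example, 63 ms over 100000 iterations gives 6.3E-4 ms).

```java
final class SyncCostSketch {
    private int refs = 0; // reference count guarded by the object's monitor

    synchronized void acquire() {
        refs++;
    }

    synchronized void release() {
        refs--;
    }

    synchronized int refs() {
        return refs;
    }

    /** Average cost per iteration in milliseconds. */
    static double avgMillis(long totalMillis, int iterations) {
        return (double) totalMillis / iterations;
    }

    /** Time a plain acquire/release loop and return the elapsed milliseconds. */
    static long timeAcquireRelease(SyncCostSketch shared, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            shared.acquire();
            try {
                // A real test would parse a query and search here.
            } finally {
                shared.release();
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int iterations = 100000;
        long total = timeAcquireRelease(new SyncCostSketch(), iterations);
        System.out.println("avg time:" + avgMillis(total, iterations) + " ms");
        System.out.println("total time:" + total + " ms");
    }
}
```

To approximate the contended case, the same loop could be run while other threads call acquire and release on the same instance.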
>>>>>
>>>>>
>>>>> - Mark
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org