Message-ID: <46F844EC.9060904@gmail.com>
Date: Mon, 24 Sep 2007 19:14:52 -0400
From: Mark Miller
To: java-user@lucene.apache.org
Subject: Re: thread safe shared IndexSearcher

Ah, thanks for catching that. That is one of the pieces I did not
finish: the keyword analyzer was placeholder code. I will take your
comments into account and update the code. I have some other pieces to
polish as well. Previously, I extended and built upon the original code,
but I can't give it away, so this is my attempt at something lesser, but
cleaner.

Jay Yu wrote:
> Thanks for the tip.
> One small improvement to the IndexAccessorFactory might be to allow the
> user to specify the Analyzer instead of using a default
> KeywordAnalyzer, which of course will make your static init of the
> cached accessors difficult unless you add more interfaces to the
> accessor to allow resetting the analyzer/Dir, as in my own version.
>
> Jay
>
> Mark Miller wrote:
>> One final note: if you are using the IndexAccessor and you are only
>> accessing the index from one JVM, you can use the NoLockFactory and
>> save some sync cost there.
>>
>> Jay Yu wrote:
>>> Mark,
>>>
>>> Great effort getting the original Lucene index accessor package into
>>> this shape. I am sure this will benefit a lot of people using Lucene
>>> in a multithreaded environment.
>>> I have a quick question for you:
>>> Do you have to use the core Lucene 2.3-dev in order to use the
>>> accessor?
>>>
>>> I will take a look at your code to see if I can help. I used a
>>> slightly modified version of the original package in my project, but
>>> it breaks some of my tests. I hope your version works better.
>>>
>>> Thanks a lot!
>>>
>>> Jay
>>>
>>> Mark Miller wrote:
>>>> I have sat down and rewritten IndexAccessor from scratch. I copied in
>>>> the same reference counting logic, pruned some things, and tried to
>>>> make the whole package a bit simpler to use. I have a few things left
>>>> to do, but it's pretty solid already. The only major thing I'd still
>>>> like to do is add an option to warm searchers before putting them
>>>> in the Searcher cache. I'd like to write some more tests as well.
>>>> Any help is greatly appreciated if you're interested in using the
>>>> thing.
>>>>
>>>> http://myhardshadow.com/indexaccessor/trunk/src/test/com/mhs/indexaccessor/SimpleSearchServer.java
>>>>
>>>> Here is an example of a class that can be instantiated in one of
>>>> multiple threads and read/modify a single index without worrying
>>>> about what any of the other threads are doing to the index at any
>>>> given time.
>>>> This is a very simple example of how to use the IndexAccessor and
>>>> not necessarily an example of best practices. The main idea is that
>>>> you get your Writer, Searcher, or Reader, and then be sure to
>>>> release it in a finally block as soon as you're done with it. For
>>>> loading, you will want to load many docs with a Writer (batch them)
>>>> before releasing it, but remember that Readers will not get a new
>>>> view of the index until you release all of the Writers. So beware of
>>>> hogging a Writer unless that's what you intend.
>>>>
>>>> JavaDoc:
>>>> http://myhardshadow.com/indexaccessorapi/
>>>>
>>>> Code:
>>>> http://myhardshadow.com/indexaccessor/trunk/
>>>>
>>>> Jar:
>>>> http://myhardshadow.com/indexaccessorreleases/indexaccessor.jar
>>>>
>>>> Your synchronized block concerns:
>>>>
>>>> The synchronized blocks that control access to the IndexAccessor
>>>> do not have a huge impact on performance. Keep in mind that the work
>>>> itself is not done in a synchronized block, only the retrieval of
>>>> the Searcher, Writer, or Reader. Even if the synchronization makes
>>>> that method twice as expensive, it is still dwarfed by the cost of
>>>> parsing queries and searching the index. This applies with or
>>>> without contention. I wrote a simple test and included the output
>>>> below. You might use the IBM Lock Analyzer for Java to further
>>>> analyze these costs. Trust me, this thing is speedy. It's many times
>>>> better than using IndexModifier.
>>>>
>>>> Without Contention
>>>> Just retrieve and release Searcher 100000 times
>>>> ----
>>>> avg time:6.3E-4 ms
>>>> total time:63 ms
>>>>
>>>> Parse query and search on 1 doc 100000 times
>>>> ----
>>>> avg time:0.03107 ms
>>>> total time:3107 ms
>>>>
>>>>
>>>> With Contention (40 other threads running 80000 searches)
>>>> Just retrieve and release Searcher 100000 times
>>>> ----
>>>> avg time:0.04643 ms
>>>> total time:4643 ms
>>>>
>>>> Parse query and search on 1 doc 100000 times
>>>> ----
>>>> avg time:0.64337 ms
>>>> total time:64337 ms
>>>>
>>>>
>>>> - Mark
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
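
The get/release pattern discussed in this thread can be sketched roughly
as follows. This is only an illustration under stated assumptions, not
the real IndexAccessor API: AccessorSketch, Searcher, Writer, count, and
isClosed are hypothetical names, and a plain list of strings stands in
for the Lucene index. The points it tries to show are the ones made
above: only checkout and release are synchronized, the search itself
runs outside the lock, handles are released in a finally block, and a
new Searcher view is opened only once the last Writer has been released.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a reference-counted accessor; NOT the real IndexAccessor. */
final class AccessorSketch {

    /** Stand-in for an IndexSearcher: a snapshot of the docs taken when it was opened. */
    static final class Searcher {
        private final List<String> snapshot;
        private int refCount;    // guarded by the AccessorSketch lock
        private boolean closed;  // guarded by the AccessorSketch lock

        private Searcher(List<String> docs) {
            this.snapshot = new ArrayList<>(docs);
        }

        /** The expensive part (the "search"); runs without holding the accessor lock. */
        long count(String term) {
            return snapshot.stream().filter(d -> d.contains(term)).count();
        }
    }

    /** Stand-in for an IndexWriter; a single instance is shared by all callers. */
    final class Writer {
        private Writer() {}

        void add(String doc) {
            synchronized (AccessorSketch.this) {
                docs.add(doc);
            }
        }
    }

    private final List<String> docs = new ArrayList<>(); // the "index"
    private final Writer writer = new Writer();
    private Searcher current = new Searcher(docs);
    private int writersOut;                              // Writers currently checked out

    /** Cheap synchronized checkout: bump the count and hand back the shared Searcher. */
    synchronized Searcher getSearcher() {
        current.refCount++;
        return current;
    }

    /** A stale Searcher is closed once its last user releases it. */
    synchronized void release(Searcher s) {
        s.refCount--;
        if (s != current && s.refCount == 0) {
            s.closed = true;
        }
    }

    synchronized Writer getWriter() {
        writersOut++;
        return writer;
    }

    /** When the last Writer comes back, open a fresh view for future Searchers. */
    synchronized void release(Writer w) {
        writersOut--;
        if (writersOut == 0) {
            Searcher old = current;
            current = new Searcher(docs);
            if (old.refCount == 0) {
                old.closed = true;
            }
        }
    }

    synchronized boolean isClosed(Searcher s) {
        return s.closed;
    }

    public static void main(String[] args) {
        AccessorSketch accessor = new AccessorSketch();

        Writer w = accessor.getWriter();
        try {
            w.add("thread safe shared searcher"); // batch more adds here before releasing
        } finally {
            accessor.release(w);                  // new view becomes visible after this
        }

        Searcher s = accessor.getSearcher();
        try {
            System.out.println("hits: " + s.count("searcher"));
        } finally {
            accessor.release(s);
        }
    }
}
```

A Searcher checked out while a Writer is still held keeps seeing the old
snapshot, which is why hogging a Writer delays visibility for readers.
The sketch leaves out searcher warming, per-analyzer configuration, and
real index locking, all of which the actual package would have to handle.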