Message-ID: <359a92830703070535i2b242797p10f163c9ce657fd0@mail.gmail.com>
Date: Wed, 7 Mar 2007 08:35:30 -0500
From: "Erick Erickson"
To: java-user@lucene.apache.org
Subject: Re: how to define a pool for Searcher?
In-Reply-To: <34b8543c0703070439p332cda1bn11b1e5c7e2cc92e0@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_210156_21053250.1173274530753"

------=_Part_210156_21053250.1173274530753
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

You may not be able to store all the documents, but what about just storing
the document IDs in a list? And remember that a Hits object re-queries the
index every 100 documents or so when you iterate through it, so if you're
really using a Hits object, you're re-executing the query anyway. You might
want to think about using a HitCollector, or TopDocs, or... and recording
the document IDs (which is very fast, and holding 50,000 integers isn't very
expensive).

If you're paging through the hits using a Hits object, I believe you'll find
no effective difference between keeping the searcher open and navigating
through the list and re-executing the query each time.
If this is true, you'll see a dramatic difference by using, say, TopDocs
when you're paging into the latter parts of the list. And you'll see an
*enormous* difference if you cache the doc IDs for your paging ...

Of course, if you're not using a Hits object most of this doesn't count.

Best
Erick

On 3/7/07, Mohammad Norouzi wrote:
>
> Yes, I am very concerned about this because we have a big project with many
> users and I am responsible for it. What preoccupies me is application
> performance, because there are more than 500 thousand records (documents).
>
> A single search may return about 50 thousand documents, and it is not
> possible to put all of them into, say, a java.util.List, so I have to keep
> the searcher open and move forward or backward through the Hits. When the
> user clicks the "Finish" button, or a specific time limit is exceeded (or
> the session is destroyed), I set a flag to true; then another session can
> access that searcher without closing any searcher or reader.
>
> Anyway, your comments are useful to me.
> Thanks
>
> On 3/7/07, Mark Miller wrote:
> >
> > To address your hits question: I wouldn't keep hits around, but would
> > re-search instead. It is often more of a headache than a time savings to
> > keep around all of the Hits objects and to have to manage them. I made
> > my own Hits object that does no caching because of this. Pagination is
> > often best done by re-querying.
> >
> > Also, keep in mind that you prob won't have 1000 Similarities...you will
> > prob have much closer to 1, maybe a couple if you have created a
> > custom one. The biggest chance you have more than one Searcher cached
> > for an Index is if you have a MultiSearcher cached that searches over
> > it.
> > Out of the box, indexAccessor does not handle MultiSearchers
> > perfectly though...it does not check out a Searcher for each of the
> > underlying Indexes, so you will have to do that yourself...then
> > remember to release them all when you release the MultiSearcher.
> >
> > I think in general, you are over concerned. IndexAccessor will handle
> > most of this for you without much intervention on your part.
> >
> > - Mark
> >
> > Mohammad Norouzi wrote:
> > > Hello Mark,
> > > there is something unclear to me about the Lucene-indexAccessor you
> > > created and how it relates to my problem.
> > > As I read your code, you create an IndexSearcher and put it into a Map,
> > > and the only thing that distinguishes them is the Similarity they have.
> > > So if, say, 1000 users with different Similarity connect to my
> > > application, there will be 1000 IndexSearchers, each with its own
> > > internal Reader.
> > > Now, in my case, I have an IndexResultSet, just like
> > > java.sql.ResultSet, which contains a Hits. So a user may go forward or
> > > backward through the Hits' documents, and in fact every user is doing
> > > this.
> > >
> > > To do so, I have to find the Similarity that a user is working with and
> > > find the right IndexSearcher in order to support pagination for her. Is
> > > this right? I mean, can I rely on the Similarity to find the right
> > > IndexReader that a user has used before?
> > >
> > > Another question: what if I have one IndexReader for all my
> > > IndexSearchers and manage their concurrent access to that single
> > > Reader?
> > >
> > > thank you very much in advance
> > >
> > >
> > > On 2/22/07, Mark Miller wrote:
> > >>
> > >> I would not do this from scratch...if you are interested in Solr go
> > >> that route, else I would build off
> > >> http://issues.apache.org/jira/browse/LUCENE-390
> > >>
> > >> - Mark
> > >>
> > >> Mohammad Norouzi wrote:
> > >> > Hi all,
> > >> > I am going to build a Searcher pool.
> > >> > If anyone has experience with this, I would be glad to hear his/her
> > >> > recommendations and suggestions. I want to know what issues I should
> > >> > consider, given that I am going to use this in a web application
> > >> > with many user sessions.
> > >> >
> > >> > thank you very much in advance.
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> For additional commands, e-mail: java-user-help@lucene.apache.org
> > >>
>
> --
> Regards,
> Mohammad
>
------=_Part_210156_21053250.1173274530753--
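
The suggestion in the reply above, recording only the document IDs and paging over that list instead of keeping a Hits object alive per session, might look roughly like the sketch below. It is plain Java with no Lucene dependency: DocIdPager and its methods are hypothetical names made up for illustration, and the int[] passed in stands in for IDs that would be gathered by a HitCollector or read from a TopDocs. How the stored fields for each page are then loaded is left out.

```java
import java.util.Arrays;

/**
 * Holds the document IDs of one search and serves fixed-size pages from them.
 * Hypothetical helper for illustration only; not part of Lucene.
 */
final class DocIdPager {
    private final int[] docIds;  // IDs in ranked order, as recorded during the search
    private final int pageSize;

    DocIdPager(int[] docIds, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.docIds = docIds.clone(); // defensive copy; 50,000 ints is about 200 KB
        this.pageSize = pageSize;
    }

    /** Number of pages, counting a final partial page. */
    int pageCount() {
        return (docIds.length + pageSize - 1) / pageSize;
    }

    /** Document IDs on the given zero-based page; empty if the page is out of range. */
    int[] page(int pageIndex) {
        if (pageIndex < 0) {
            return new int[0];
        }
        long from = (long) pageIndex * pageSize;
        if (from >= docIds.length) {
            return new int[0];
        }
        int to = (int) Math.min(docIds.length, from + pageSize);
        return Arrays.copyOfRange(docIds, (int) from, to);
    }

    public static void main(String[] args) {
        // Stand-in for IDs gathered by a HitCollector or read from TopDocs (assumption).
        int[] ids = {12, 7, 42, 3, 99};
        DocIdPager pager = new DocIdPager(ids, 2);
        for (int p = 0; p < pager.pageCount(); p++) {
            System.out.println("page " + p + ": " + Arrays.toString(pager.page(p)));
        }
    }
}
```

A pager like this can live in the user's session without pinning a searcher. One caveat: Lucene document IDs are only meaningful against the reader that produced them, so a cached list should be discarded once the index is reopened.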