Message-ID: <359a92830709130701m6a67d299k7f43936ee55c86f5@mail.gmail.com>
Date: Thu, 13 Sep 2007 10:01:56 -0400
From: "Erick Erickson"
To: java-user@lucene.apache.org
Reply-To: java-user@lucene.apache.org
Subject: Re: regarding FieldSelector
In-Reply-To: <34b8543c0709130150o2bf746beg995cfb00a6a97f68@mail.gmail.com>
References: <34b8543c0709120213m619a117aj1f988334890a4cd9@mail.gmail.com>
 <6495BFAA-0337-4830-B672-AFCEFDD3A9F9@apache.org>
 <34b8543c0709120440h4b5f1ecau397ed25d16b35a74@mail.gmail.com>
 <359a92830709120653y162100fdv97aa5585350781d7@mail.gmail.com>
 <34b8543c0709130150o2bf746beg995cfb00a6a97f68@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
 boundary="----=_Part_20987_7506825.1189692116400"

------=_Part_20987_7506825.1189692116400
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Do you have any evidence that you're having a performance issue?
If not, I'd just do the simple thing and ignore the rest.

The performance issues I found were because I was spinning through
many, many documents. If you're only worrying about one document
at a time, it may not be an issue.

If you *are* having performance issues, I'd *strongly* recommend that
you measure to find out where the problem is before trying a solution.
Otherwise you'll optimize code that isn't the problem.

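A minimal sketch of the kind of measurement meant here, not code from this thread and not Lucene's API: LazyFieldSketch, LazyField and loadsForResultPage are hypothetical names, and the "large field" is just a string built on demand. It counts (and roughly times) how often the expensive load runs when every hit is loaded eagerly versus when only the opened hit is loaded lazily.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Illustrative model only: these classes are hypothetical and are not Lucene's API.
class LazyFieldSketch {

    /** A stored field whose value is fetched on first access and then cached. */
    static final class LazyField {
        private final Supplier<String> loader;
        private String value;
        private boolean loaded;

        LazyField(Supplier<String> loader) {
            this.loader = loader;
        }

        String get() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
            }
            return value;
        }
    }

    /**
     * Builds a page of docCount hits (docCount must be at least 1), then opens
     * the first one. Returns how many times the pretend-expensive load ran.
     */
    static int loadsForResultPage(int docCount, boolean lazy) {
        int[] loads = {0};
        List<LazyField> page = new ArrayList<>();
        for (int id = 0; id < docCount; id++) {
            final int docId = id;
            LazyField body = new LazyField(() -> {
                loads[0]++;
                return "body of doc " + docId;
            });
            if (!lazy) {
                body.get(); // eager: pay for every hit while building the page
            }
            page.add(body);
        }
        page.get(0).get(); // the one result the user actually opens
        return loads[0];
    }

    public static void main(String[] args) {
        int docCount = 100_000;
        for (boolean lazy : new boolean[] {false, true}) {
            long start = System.nanoTime();
            int loads = loadsForResultPage(docCount, lazy);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println((lazy ? "lazy " : "eager") + ": " + loads
                    + " large-field loads, " + micros + " us");
        }
    }
}
```

The load count is the dependable signal in this toy; the timings vary from run to run and only become meaningful against a real index. With a single document per request the two modes do the same work, which is why measuring the real access pattern comes first.
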
Best
Erick

On 9/13/07, Mohammad Norouzi wrote:
>
> Thanks
> As far as I can see in the documentation, we can only use this great field
> selector in the IndexReader.document() method. The problem is that I have a
> Searcher in my result set structure, and when the client calls
> getString("a_field_name"), at that time I invoke
> searcher.doc(current_doc_id).get("a_field_name").
> I have already collected the result IDs, so in my case I can't use
> FieldSelector.
>
> Do I have to revise the way of retrieving documents in my code?
>
>
> On 9/12/07, Erick Erickson wrote:
> >
> > Well, it depends on what "improve the search process" means
> > in your context ..
> >
> > But I had a case similar to yours that I wrote up in the Wiki where
> > my search times improved about 10X by using lazy loading. You
> > might want to read that entry here...
> >
> > http://wiki.apache.org/lucene-java/FieldSelectorPerformance
> >
> > Note the peculiar characteristics of my data set, I really suspect
> > that a 10x improvement in retrieval speed is atypical...
> >
> > As for when lazily-loaded fields actually get loaded, I didn't really
> > have to explore it very fully, but a short experiment should do it
> > for you.....
> >
> > Best
> > Erick
> >
> > On 9/12/07, Mohammad Norouzi wrote:
> > >
> > > Hi Grant,
> > > Many thanks for your nice document about advanced Lucene. It was very
> > > useful for me.
> > >
> > > As I understand it, we can set some large fields to be lazily loaded. Now
> > > my question is: when will they be loaded? It would make sense that when we
> > > call doc.get("field_name") the field is loaded from the index. Am I right?
> > >
> > > In my application, I've provided a result set structure to navigate
> > > between results and documents. It provides a get(String fieldname) method,
> > > just like the java.sql.ResultSet.getString() method, and it also implements
> > > HitCollector in order to collect my own ID rather than Lucene's document
> > > id. So I think I can set my ID field to always be loaded and the other
> > > fields to be lazily loaded. Does this improve the search process?
> > >
> > > Again, thank you very much indeed.
> > >
> > >
> > > On 9/12/07, Grant Ingersoll wrote:
> > > >
> > > > Hi Mohammad,
> > > >
> > > > The typical use cases are:
> > > > 1. You have several small fields used in a results display and one or
> > > > two large fields (i.e. the original document) and you don't want to
> > > > pay the cost of loading the large fields for results display because
> > > > most of them won't be chosen. When a result is chosen, the lazily
> > > > loaded field will be retrieved.
> > > >
> > > > 2. You only want to load certain fields, or the first field, or you
> > > > just want to know the size of a field.
> > > >
> > > > Basically, it gives you control over how fields are loaded from disk
> > > > in Lucene.
> > > >
> > > > See my ApacheCon Europe presentation
> > > > http://cnlp.org/presentations/slides/AdvancedLuceneEU.pdf for a few
> > > > slides (towards the end) on FieldSelector.
> > > >
> > > > On Sep 12, 2007, at 5:13 AM, Mohammad Norouzi wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Can anyone explain what the FieldSelector is, and the usage or
> > > > > benefits of this structure? I read the javadocs but I can't work out
> > > > > what goal it is provided for in Lucene.
> > > > >
> > > > > Thanks in advance
> > > > >
> > > > > --
> > > > > Regards,
> > > > > Mohammad
> > > > > --------------------------
> > > > > see my blog: http://brainable.blogspot.com/
> > > > > another in Persian: http://fekre-motefavet.blogspot.com/
> > > >
> > > > --------------------------
> > > > Grant Ingersoll
> > > > http://lucene.grantingersoll.com
> > > >
> > > > Lucene Helpful Hints:
> > > > http://wiki.apache.org/lucene-java/BasicsOfPerformance
> > > > http://wiki.apache.org/lucene-java/LuceneFAQ

------=_Part_20987_7506825.1189692116400--