From: Siraj Haider
To: "java-user@lucene.apache.org"
CC: "simon.willnauer@gmail.com"
Date: Tue, 23 Oct 2012 18:05:51 -0400
Subject: RE: Scoring based on document
Message-ID: <7276923E6E5AA04C97876BF29FF86716E071A894@jobdivaexmb1.jobdiva.local>

Thanks for the suggestion, but in that scenario I would lose the ability to search on individual fields, i.e. I would not be able to search on the title field only, and would end up with results where the searched term might be in the description field.

regards
-Siraj
(212) 306-0154

-----Original Message-----
From: selvakumar netaji [mailto:vvekselva.gm@gmail.com]
Sent: Tuesday, October 23, 2012 1:58 PM
To: java-user@lucene.apache.org
Cc: simon.willnauer@gmail.com
Subject: Re: Scoring based on document

Hi All,

Just wanted to check whether this approach would fail for this case.

Having a copy field for each document, holding the concatenated values of all the fields in that document, and searching on the copy field would produce the result. The resulting docs would be ranked based on the frequency of the query terms in the whole document.

On Tue, Oct 23, 2012 at 7:48 PM, Siraj Haider wrote:

> So, just to confirm, using Lucene 4.0, we would be able to issue a
> search on one or more fields and would be able to get the results
> sorted by a custom field and also would be able to get the score of
> each document based on the frequency of the terms searched in all the
> indexed fields of that document (rather than getting it scored just by
> the fields in the query, which is the case now).
>
> If this is true, could you please guide me in the direction of how to
> implement it?
>
> Thanks a lot for your help.
>
> -Siraj
> (212) 306-0154
>
> -----Original Message-----
> From: Simon Willnauer [mailto:simon.willnauer@gmail.com]
> Sent: Tuesday, October 23, 2012 3:51 AM
> To: java-user@lucene.apache.org
> Subject: Re: Scoring based on document
>
> hey there,
>
> in Lucene 4 you can override the termStatistics / CollectionStatistics
> used for scoring in the IndexSearcher. You can take multiple fields
> into account here in order to use them for scoring. Here are the javadoc
> links:
>
> http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/IndexSearcher.html#termStatistics(org.apache.lucene.index.Term, org.apache.lucene.index.TermContext)
>
> http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/IndexSearcher.html#collectionStatistics(java.lang.String)
>
> simon
>
> On Mon, Oct 22, 2012 at 11:25 PM, Siraj Haider wrote:
> > I am using DefaultSimilarity and did not boost any field while indexing.
> > My index is comprised of the following fields:
> >
> > - Title
> > - Author
> > - Bookname
> > - Description
> >
> > All of the 4 fields are indexed and can be searched on by the user.
> > Now let's say the user searches for "oracle" in the Title field: the
> > score is computed based on the Title field only, and it's disregarding
> > the frequency of the term "oracle" in other fields. It might be like
> > that by design, but I need to change it so that the documents are
> > ranked based on the frequency in the whole document and not based on
> > the field searched. Please help!
> >
> > Thanks in advance
> > -Siraj
> >
> > ________________________________
> > This electronic mail message and any attachments may contain
> > information which is privileged, sensitive and/or otherwise exempt from
> > disclosure under applicable law. The information is intended only for
> > the use of the individual or entity named as the addressee above.
> > If you are not the intended recipient, you are hereby notified that any
> > disclosure, copying, distribution (electronic or otherwise) or
> > forwarding of, or the taking of any action in reliance on, the contents
> > of this transmission is strictly prohibited. If you have received this
> > electronic transmission in error, please notify us by telephone,
> > facsimile, or e-mail as noted above to arrange for the return of any
> > electronic mail or attachments. Thank You.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
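
The requirement discussed in this thread (match on one field, but rank by how often the term occurs in the whole document) can be sketched in plain Java. This is only an illustration of the intended ranking, not Lucene code and not the approach from any of the replies: the class WholeDocumentScoring, its methods, and the sample documents are all hypothetical, tokenization is a naive split on non-word characters, and the score is a raw term count rather than anything DefaultSimilarity would compute.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical plain-Java illustration; it does not use or mimic the Lucene API. */
class WholeDocumentScoring {

    /** A matching document index together with its whole-document term count. */
    static final class Hit {
        final int docId;
        final int score;
        Hit(int docId, int score) { this.docId = docId; this.score = score; }
    }

    /** Counts case-insensitive whole-token occurrences of a lowercase term in text. */
    static int termFrequency(String text, String lowercaseTerm) {
        int count = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.equals(lowercaseTerm)) count++;
        }
        return count;
    }

    /** Matches on queryField only, but scores by the term count over all fields. */
    static List<Hit> search(List<Map<String, String>> docs, String queryField, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Hit> hits = new ArrayList<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            Map<String, String> doc = docs.get(docId);
            // Matching: only the queried field decides whether the document is a hit.
            if (termFrequency(doc.getOrDefault(queryField, ""), needle) == 0) continue;
            // Scoring: sum the term frequency over every field of the document.
            int score = 0;
            for (String value : doc.values()) score += termFrequency(value, needle);
            hits.add(new Hit(docId, score));
        }
        // Highest whole-document count first; ties keep their original order.
        hits.sort(Comparator.comparingInt((Hit h) -> h.score).reversed());
        return hits;
    }

    /** Builds one document with the four fields named in the thread. */
    static Map<String, String> doc(String title, String author, String bookname, String description) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", title);
        fields.put("author", author);
        fields.put("bookname", bookname);
        fields.put("description", description);
        return fields;
    }

    /** Made-up sample data, for illustration only. */
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("Oracle Basics", "A. Writer", "Database Primer",
                "An introduction to relational databases."));
        docs.add(doc("Oracle Tuning", "B. Writer", "Database Handbook",
                "Oracle indexes and Oracle query plans"));
        docs.add(doc("Java Performance", "C. Writer", "JVM Guide",
                "Mentions Oracle JDBC drivers"));
        return docs;
    }

    public static void main(String[] args) {
        for (Hit hit : search(sampleDocs(), "title", "oracle")) {
            System.out.println("doc " + hit.docId + " score " + hit.score);
        }
    }
}
```

In this sketch the third sample document is excluded because "oracle" does not appear in its title, even though its description mentions the term. That keeps field-level search, which is the ability a single concatenated copy field would give up, while the ordering of the remaining hits still reflects the whole document.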