From java-user-return-42017-apmail-lucene-java-user-archive=lucene.apache.org@lucene.apache.org Tue Aug 25 18:09:52 2009
Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Reply-To: java-user@lucene.apache.org
References: <786fde50908250917w515f1ff1tc9d03f60e7637557@mail.gmail.com>
Date: Tue, 25 Aug 2009 15:09:42 -0300
Subject: Re: How to give a score for all documents?
From: Fabrício Raphael
To: java-user@lucene.apache.org

I am continuing some work on wavelets in IR. You will find an example in
the article below.

http://www.ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=4740460&isnumber=4740405&punumber=4740404&k2dockey=4740460@ieeecnfs&query=%28%28using+wavelets+to+classify+documents%29%3Cin%3Eti+%29&pos=0&access=no

Regards,

On Tue, Aug 25, 2009 at 2:57 PM, Simon Willnauer <simon.willnauer@googlemail.com> wrote:

> Hi Fabricio,
>
> I will try to recap what you are trying to say...
>
> Your IR model scores documents that would not be returned by a
> particular query. So you have some other indicator that makes a
> document relevant?! If it is not a term, could you give us an example?
> How would you decide if a doc is relevant, or what would make it
> "scorable"?
>
> I guess an example would help :)
>
> simon
>
> 2009/8/25 Fabrício Raphael:
> > First, that a document is relevant to a query does not necessarily
> > mean that this document has to contain some query term. You can have
> > other ways to assert that a document is relevant to a query.
> >
> > My IR model is different from the vector model, so it can give a
> > non-null score to documents that are irrelevant under the vector
> > model. I know that Lucene implements the vector model, but I want to
> > use Lucene's facilities because I like what Lucene provides.
> >
> > But Lucene gives scores only to documents that are relevant under the
> > vector model, and my model can give a score even when a document isn't
> > relevant to the vector model. It depends on the configured granularity
> > of execution.
> >
> > So I would like the nextDoc() method of the class that implements
> > Scorer to return all the documents by the end of the iteration, so
> > that a score is calculated for each of them.
> >
> > I can already calculate the customized score for the documents that
> > Lucene returns according to the vector model.
> >
> > I hope you have understood me!
> >
> > Thanks!
> >
> > On Tue, Aug 25, 2009 at 1:17 PM, Shai Erera wrote:
> >
> >> Can you please elaborate more on the use case? Why, if a certain
> >> document is irrelevant to a certain query, would you like to give it
> >> a score? Are you perhaps talking about certain documents which should
> >> always appear in search results, no matter what the query is? And
> >> instead of always showing them, you'd like to give them a "static
> >> score", so that they can compete w/ other docs?
> >>
> >> If that's the case, I think you can use a BooleanQuery such that the
> >> user query is added as a clause and then you add another clause
> >> (MUST) which is in fact a MatchAllDocsQuery or something like that
> >> which returns a customized score. It's expensive though, as for each
> >> query you'll score all docs in the index.
> >>
> >> But I don't think that will help (at least for this use case), since
> >> every document relevant to the query will get the same added score as
> >> an 'irrelevant' document, which means the relevant docs will still
> >> win, no?
> >>
> >> Shai
> >>
> >> 2009/8/25 Fabrício Raphael
> >>
> >> > I already know about this, but I want to give a customized score to
> >> > all documents in the collection, regardless of whether each
> >> > document is or isn't relevant to the vector model.
> >> >
> >> > The similarity function is called only when the document is
> >> > relevant to the vector model.
> >> >
> >> > Do you understand me?
> >> >
> >> > Thanks!
> >> >
> >> > On Sat, Aug 22, 2009 at 2:28 AM, prashant ullegaddi <prashullegaddi@gmail.com> wrote:
> >> >
> >> > > If you want to modify the way Lucene scores documents, I guess
> >> > > you need to extend the Similarity class and provide your own
> >> > > implementation. Take a look at:
> >> > >
> >> > > http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/search/DefaultSimilarity.html
> >> > > http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/search/Similarity.html
> >> > >
> >> > > Thanks,
> >> > > Prashant.
> >> > >
> >> > > 2009/8/21 Fabrício Raphael
> >> > >
> >> > > > How to give a customized score to all documents, independent of
> >> > > > the vector model?
> >> > > >
> >> > > > I already know how to give a customized score, but I want to
> >> > > > give this customized score to all documents in the collection,
> >> > > > regardless of what is relevant to the vector model.
> >> > > >
> >> > > > How to do this?
> >> > > >
> >> > > > Thanks in advance!
> >> > > >
> >> > > > --
> >> > > > Fabrício Raphael
> >> >
> >> > --
> >> > Fabrício Raphael
> >
> > --
> > Fabrício Raphael
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

-- 
Fabrício Raphael