Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
Received-SPF: pass (athena.apache.org: domain of lahiruts@gmail.com designates
 74.125.82.48 as permitted sender)
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=mime-version:in-reply-to:references:x-goomoji-body:date:message-id
         :subject:from:to:content-type;
        b=XZVdxfCyhYEOtiWdMU8WYmwh9BqoexM0qe1dVUEcIjz33PhWIHAEm0de/TUnv3loso
         7BNeyj2eE/AHhh5iAcIIZpBbY5W+I9yltMAb2cDaJSUElRUc4YJFKLWkWiTGxKedDjDA
         rzTjVFAAOE8SdvaBrN1RP6DWA9rQDAhH6d84g=
MIME-Version: 1.0
In-Reply-To: <BANLkTikYvdsvRkuWEg4XxnMX8ZwaEuPTUQ@mail.gmail.com>
References: <BANLkTimx2LkWXvyEDJyvzL4QUd6Mno=sDw@mail.gmail.com>
	<BANLkTinTtCbeDT_ZnNqCQnPq0-kjxCMJaw@mail.gmail.com>
	<BANLkTi==nKaAk5pntu7s=gXqeDQ_uxurrA@mail.gmail.com>
	<BANLkTikYvdsvRkuWEg4XxnMX8ZwaEuPTUQ@mail.gmail.com>
Date: Tue, 14 Jun 2011 10:41:37 +0530
Message-ID: <BANLkTi=KiC3sJHyC_JDYy2C2F_S6c54CWw@mail.gmail.com>
Subject: Re: Modifying Length Normalization calculation
From: Lahiru Samarakoon <lahiruts@gmail.com>
To: java-user@lucene.apache.org
Content-Type: multipart/related; boundary=e0cb4e38500c0a30f804a5a51292

--e0cb4e38500c0a30f804a5a51292
Content-Type: multipart/alternative; boundary=e0cb4e38500c0a30f504a5a51291

--e0cb4e38500c0a30f504a5a51291
Content-Type: text/plain; charset=ISO-8859-1

Hi Ian,

The order is right and your method is working for me.

Thanks

Lahiru

On Mon, Jun 13, 2011 at 7:15 PM, Ian Lea <ian.lea@gmail.com> wrote:

> This is getting beyond my level of expertise, but I'll have a go at
> your questions.  Hopefully someone better informed will step in with
> corrections or confirmation.
>
> > ...
> > The application calls the *writer.addDocument(d);* method and in this
> > process the *lengthNorm(String fieldName, int numTerms)* method is called.
> > I can extend the *DefaultSimilarity* class and override the
> > *lengthNorm* method, but how can I explicitly specify the
> > *numTerms* value?
>
> I don't know that you can, but you don't have to use the value passed in.
>
> > ...
> > Is the *computeNorm* method called for every field, or only for
> > analyzed fields?
>
> All indexed fields, at a guess, whether analyzed or not.
>
> > Is the order in which we call *addDocument* the same as the order in
> > which the *computeNorm* method is called?
>
> Probably.
>
> > Is there any possibility that I can access the *Document* object inside
> > the *Similarity* class?
>
> Not that I know of via API calls. If you had your own Similarity
> implementation, and methods are called in the order you expect, you
> could add a setDoc(Document) method and/or a setCalcValue(n) method
> and use them as you wished in your custom computeNorm() or
> lengthNorm() code.
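
A minimal sketch of the setCalcValue(n) idea described above. To keep it runnable without Lucene on the classpath, BaseSimilarity is a stand-in for DefaultSimilarity; only the lengthNorm(String, int) signature and the 1/sqrt(numTerms) default are taken from Lucene 3.x. EffectiveLengthSimilarity, its field, and the demo class are hypothetical names.

```java
// Stand-in for Lucene 3.x DefaultSimilarity (assumption: default length norm
// is 1/sqrt(numTerms)). In a real application you would extend
// org.apache.lucene.search.DefaultSimilarity instead.
class BaseSimilarity {
    public float lengthNorm(String fieldName, int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }
}

// Hypothetical subclass: ignores the tokenizer's count whenever the
// application has supplied its own "effective" term count.
class EffectiveLengthSimilarity extends BaseSimilarity {
    private int effectiveNumTerms = -1; // -1 means "not set, use the value passed in"

    // Hypothetical setter, following the setCalcValue(n) suggestion.
    public void setCalcValue(int n) {
        this.effectiveNumTerms = n;
    }

    @Override
    public float lengthNorm(String fieldName, int numTerms) {
        int n = (effectiveNumTerms > 0) ? effectiveNumTerms : numTerms;
        return super.lengthNorm(fieldName, n);
    }
}

class EffectiveLengthDemo {
    public static void main(String[] args) {
        EffectiveLengthSimilarity sim = new EffectiveLengthSimilarity();
        sim.setCalcValue(4);                              // application-computed count
        System.out.println(sim.lengthNorm("body", 100));  // prints 0.5, not 0.1
    }
}
```

In use, the setter would be called immediately before each writer.addDocument(d). Because the value is shared mutable state, this only holds up if documents are added from a single thread and the norm is computed in the order the documents are added.
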
>
>
> --
> Ian.
>
>
> > On Mon, Jun 13, 2011 at 3:09 PM, Ian Lea <ian.lea@gmail.com> wrote:
> >
> >> org.apache.lucene.search.Similarity would be the place to look,
> >> specifically computeNorm(String field, FieldInvertState state).  There
> >> is comprehensive info in the javadocs.  Note that values are
> >> calculated at indexing and stored in the index encoded, with some loss
> >> of precision.
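
A rough illustration of the precision loss mentioned above. The encode/decode pair is written from memory and is assumed to mirror the single-byte norm encoding in Lucene 3.x (SmallFloat.floatToByte315); treat the exact bit layout as an assumption and check the Lucene source before relying on it. NormPrecisionDemo and roundTrip are hypothetical names.

```java
class NormPrecisionDemo {
    // Keeps only the float's bits above bit 21 (exponent plus the top two
    // mantissa bits), rebased so the result fits in one byte.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;
        int zero = (63 - 15) << 3;
        if (small <= zero) {
            return (bits <= 0) ? (byte) 0 : (byte) 1; // zero/negative, or underflow
        }
        if (small >= zero + 0x100) {
            return -1;                                // overflow: largest byte value
        }
        return (byte) (small - zero);
    }

    // Inverse of encode: the discarded low mantissa bits come back as zeros.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Default-style length norm for numTerms, as it would read back from the index.
    static float roundTrip(int numTerms) {
        float norm = (float) (1.0 / Math.sqrt(numTerms));
        return decode(encode(norm));
    }

    public static void main(String[] args) {
        for (int n = 9; n <= 11; n++) {
            System.out.println(n + " terms -> " + roundTrip(n));
        }
    }
}
```

Under this encoding, 9 and 10 terms both read back as 0.3125 while 11 terms reads back as 0.25, so small changes to the effective term count may not change the stored norm at all.
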
> >>
> >>
> >> --
> >> Ian.
> >>
> >> On Mon, Jun 13, 2011 at 7:31 AM, Lahiru Samarakoon <lahiruts@gmail.com>
> >> wrote:
> >> > Hi All,
> >> >
> >> > I want to change the length normalization calculation for my
> >> > application by changing the "*number of terms*" according to my
> >> > requirements. The "*StandardTokenizer*" works perfectly for my
> >> > application. However, the *number of terms* calculated by the tokenizer
> >> > is not the effective number of terms for the application. I have a
> >> > mechanism to calculate that value, and I need to know how I can apply
> >> > it in the length normalization calculation.
> >> >
> >> > Please advise.
> >> >
> >> > Thank you,
> >> >
> >> > Best Regards,
> >> > Lahiru.
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
>

--e0cb4e38500c0a30f504a5a51291--

--e0cb4e38500c0a30f804a5a51292--