lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ole-Martin Mørk <olemar...@gmail.com>
Subject Re: Help understanding fieldNorm
Date Mon, 05 Oct 2009 13:42:39 GMT
Thanks. It might be that Nutch sets some values. I am not able to find
anything in the config files though.
We are using nutch' solrindex.

--
Ole-Martin Mørk
http://twitter.com/olemartin
http://flickr.com/olemartin


On Mon, Oct 5, 2009 at 2:28 PM, Simon Willnauer <
simon.willnauer@googlemail.com> wrote:

> Have a look at your schema definition I guess thats the place where boosts
> are set if not defined in the data you send to you solr instance.
>
> simon
>
> On Mon, Oct 5, 2009 at 2:14 PM, Ole-Martin Mørk <olemartin@gmail.com>
> wrote:
>
> > That might be true. The document boost did not change, but maybe the
> field
> > boost changed. Is it possible to retrieve the field boost from solr?
> > --
> > Ole-Martin Mørk
> >
> >
> > On Mon, Oct 5, 2009 at 2:01 PM, Simon Willnauer <
> > simon.willnauer@googlemail.com> wrote:
> >
> > > I still guess that the document has been indexed with different boost
> > > factors the first time if you did not change the length of the URL.
> > > Can you make sure this did not happen?
> > >
> > > simon
> > >
> > > On Mon, Oct 5, 2009 at 12:45 PM, Ole-Martin Mørk <olemartin@gmail.com
> > > >wrote:
> > >
> > > > I did not change the url. The length of the title was increased by 1,
> > > from
> > > > 41 to 42 characters.
> > > > --
> > > > Ole-Martin Mørk
> > > >
> > > >
> > > > On Mon, Oct 5, 2009 at 12:39 PM, Karl Wettin <karl.wettin@gmail.com>
> > > > wrote:
> > > >
> > > > > sorry, I ment title.
> > > > >
> > > > > 5 okt 2009 kl. 11.57 skrev Simon Willnauer:
> > > > >
> > > > >
> > > > >  Ole-Martin, did you mention that you did not change the URL value
> > but
> > > > the
> > > > >> title?
> > > > >>
> > > > >> simon
> > > > >>
> > > > >> On Mon, Oct 5, 2009 at 11:52 AM, Karl Wettin <
> karl.wettin@gmail.com
> > >
> > > > >> wrote:
> > > > >>
> > > > >>  Hi Ole-Martin,
> > > > >>>
> > > > >>> how many characters was it in the url in before and after
update?
> > > > >>>
> > > > >>>
> > > > >>>   karl
> > > > >>>
> > > > >>> 5 okt 2009 kl. 10.21 skrev Ole-Martin Mørk:
> > > > >>>
> > > > >>>
> > > > >>> Hi. I am trying to understand Lucene's scoring algorithm.
We're
> > > > >>>
> > > > >>>> getting some strange results. First we search for a given
page
> by
> > > it's
> > > > >>>> url. We get this result:
> > > > >>>>
> > > > >>>> 0.0014793393 = fieldWeight(url:"our super secret url"
in 22),
> > > product
> > > > >>>> of:
> > > > >>>> 1.0 = tf(phraseFreq=1.0)
> > > > >>>> 32.31666 = idf(url: www=7327 host=321 com=7327 article=2456
> > > > >>>> something=2 something=44 704290075=1)
> > > > >>>> 4.5776367E-5 = fieldNorm(field=url, doc=22)
> > > > >>>>
> > > > >>>> When this is done, we use solrJ to read and write the
document.
> > The
> > > > >>>> only change is the title of the document (appends the
number 2)
> > > > >>>>
> > > > >>>> We search again and the fieldNorm is changed significantly:
> > > > >>>>
> > > > >>>> 9.874598 = fieldWeight(url:"our super secret url" in
0), product
> > of:
> > > > >>>> 1.0 = tf(phraseFreq=1.0)
> > > > >>>> 31.598713 = idf(url: www=7328 host=322 com=7328 article=2457
> > > > >>>> something=3 somthing=45 704290075=2)
> > > > >>>> 0.3125 = fieldNorm(field=url, doc=0)
> > > > >>>>
> > > > >>>> Why does the value of fieldNorm change so much?
> > > > >>>>
> > > > >>>> Looking forward to your answers.
> > > > >>>>
> > > > >>>> --
> > > > >>>> Ole-Martin Mørk
> > > > >>>> http://twitter.com/olemartin
> > > > >>>> http://flickr.com/olemartin
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>>
> > ---------------------------------------------------------------------
> > > > >>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > >>> For additional commands, e-mail:
> java-user-help@lucene.apache.org
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >
> > > > >
> ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message