lucene-java-user mailing list archives

From "Daniel Freudenberger" <d.freudenber...@trade-a-game.de>
Subject RE: boosting relevance of certain documents
Date Fri, 25 Apr 2008 19:50:46 GMT
Thanks for your response. I already knew that relevance is based on term
frequency, but in some cases that is just not what the user expects.
As I already mentioned, "fifa 2003 fifa 03" vs. "fifa 08" is such a case:
searching for "fifa" returns the "fifa 2003 fifa 03" document first, but
the "fifa 08" document is more important (from the user's point of view).

Any suggestions?

Best regards,
Daniel
-----Original Message-----
From: Jonathan Ariel [mailto:ionathan@gmail.com] 
Sent: Friday, April 25, 2008 8:11 PM
To: java-user@lucene.apache.org
Subject: Re: boosting relevance of certain documents

OK. I'm not an expert on the scoring algorithm, but based on tf*idf you
can tell that the returned document scores higher because it has a higher
term frequency.

Using the explain output you can see the following:

Doc 1
0.643841 = (MATCH) fieldWeight(searchable:fifa in 0), product of:
  1.0 = tf(termFreq(searchable:fifa)=1)
  1.287682 = idf(docFreq=2)
  0.5 = fieldNorm(field=searchable, doc=0)

Doc 2
0.68289655 = (MATCH) fieldWeight(searchable:fifa in 1), product of:
  1.4142135 = tf(termFreq(searchable:fifa)=2)
  1.287682 = idf(docFreq=2)
  0.375 = fieldNorm(field=searchable, doc=1)
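
To make the arithmetic in the two explanations explicit, here is a minimal
sketch in plain Java (no Lucene dependency). It assumes the classic default
formulas, tf = sqrt(termFreq) and idf = 1 + ln(numDocs / (docFreq + 1)), and
an index of 4 documents, which is consistent with idf = 1.287682 above. The
fieldNorm values are copied from the explain output rather than recomputed.
The class and method names are made up for illustration.

```java
import java.util.Locale;

public class FieldWeightSketch {

    // Assumed classic tf: square root of the raw term frequency.
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    // Assumed classic idf: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // fieldWeight as shown by explain: tf * idf * fieldNorm.
    static double fieldWeight(int termFreq, int docFreq, int numDocs, double fieldNorm) {
        return tf(termFreq) * idf(docFreq, numDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        int numDocs = 4; // assumption: the four example products are the whole index
        int docFreq = 2; // "fifa" occurs in two documents

        // fieldNorm values taken from the explain output above.
        double doc1 = fieldWeight(1, docFreq, numDocs, 0.5);   // "fifa 08 - playstation 3"
        double doc2 = fieldWeight(2, docFreq, numDocs, 0.375); // "fifa 2003 fifa 03 - playstation 3"

        System.out.println(String.format(Locale.ROOT, "Doc 1: %.6f", doc1));
        System.out.println(String.format(Locale.ROOT, "Doc 2: %.6f", doc2));
    }
}
```

The longer document gets a smaller fieldNorm (0.375 vs. 0.5), but the second
occurrence of "fifa" raises tf from 1.0 to about 1.414, which is enough to
push its score slightly above the shorter document's.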

On Fri, Apr 25, 2008 at 2:30 PM, Daniel Freudenberger <
d.freudenberger@trade-a-game.de> wrote:

> I'm using the StandardAnalyzer. I hope this answers your question (I'm
> quite new to Lucene).
>
> -----Original Message-----
> From: Jonathan Ariel [mailto:ionathan@gmail.com]
> Sent: Friday, April 25, 2008 6:59 PM
> To: java-user@lucene.apache.org
> Subject: Re: boosting relevance of certain documents
>
> How are you analyzing the searchable field?
>
> On Fri, Apr 25, 2008 at 12:49 PM, Daniel Freudenberger <
> d.freudenberger@trade-a-game.de> wrote:
>
> > Hello,
> >
> >
> >
> > I'm using Lucene in a new project and I'm not sure how to solve the
> > following problem: my index consists of the two attributes "id" and
> > "searchable". "id" is the ID of a product and "searchable" is a
> > combination of the product name and its category name.
> >
> >
> >
> >  Example:
> >
> >  id    searchable
> >  1     fifa 08 - playstation 3
> >  2     fifa 2003 fifa 03 - playstation 3
> >  3     playstation 60gb hdd - playstation 3
> >  4     playstation i like you - playstation 3
> >
> >
> >
> > When searching for "fifa", Lucene returns the product with id 2 first,
> > whereas id 1 ("fifa 08") would be the much more relevant result (from the
> > user's point of view). The same problem arises when searching for
> > "playstation": the customer expects products with "playstation" in their
> > names first, ideally the console itself. In reality, however, he also
> > gets every product that is merely in the "playstation" category.
> >
> >
> >
> > My idea was to introduce another attribute, "relevance", which can
> > increase the ranking of an entry. The actual relevance shouldn't be
> > suppressed completely, though; the new attribute should only be taken
> > into account for products that are similarly relevant for a specific
> > search term.
> >
> >
> >
> > Does anybody have an idea on how to solve this problem?
> >
> >
> >
> > Thank you in advance,
> >
> > Daniel
> >
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
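
One possible reading of the "relevance" attribute proposed in the original
message is a per-document multiplier applied to the text score. The sketch
below is plain Java, not Lucene API code; the class names, the `relevance`
field, and the factor 1.2 are all hypothetical, and the text scores are the
ones from the explain output earlier in the thread. A multiplier only reorders
hits whose text scores are already close, so it does not suppress the actual
relevance. (Lucene of this era also has an index-time document boost,
Document.setBoost, which may be a more direct way to get a similar effect.)

```java
import java.util.ArrayList;
import java.util.List;

public class RelevanceBoostSketch {

    // Hypothetical search hit: product id, text score, editorial relevance factor.
    public static final class Hit {
        public final int id;
        public final double textScore;
        public final double relevance;

        public Hit(int id, double textScore, double relevance) {
            this.id = id;
            this.textScore = textScore;
            this.relevance = relevance;
        }

        // Combined score: the text score scaled by the relevance factor.
        public double combined() {
            return textScore * relevance;
        }
    }

    // Returns a copy of the hits sorted by combined score, highest first.
    public static List<Hit> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(b.combined(), a.combined()));
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit(1, 0.643841, 1.2));   // "fifa 08", assumed factor 1.2
        hits.add(new Hit(2, 0.68289655, 1.0)); // "fifa 2003 fifa 03", no boost

        for (Hit h : rank(hits)) {
            System.out.println("id=" + h.id + " combined=" + h.combined());
        }
    }
}
```

With these assumed factors, id 1 moves ahead of id 2, while a boosted hit
with a much lower text score would still rank below both.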