lucene-java-user mailing list archives

From "Karl Koch" <TheRan...@gmx.net>
Subject Re: About searching in multiple fields with one query
Date Mon, 14 Nov 2005 12:00:12 GMT
Hi Jian,

Are you sure about that? That would be quite a bad thing to do. I am referring
to the paper by Robertson et al.

http://citeseer.ist.psu.edu/robertson04simple.html

in which it is shown that summing the scores of multiple fields violates a
number of basic assumptions in TF/IDF. Although this is shown for the BM25
algorithm, it generalises to a scoring function like the one used in Lucene.
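
To make the concern concrete, here is a minimal sketch in plain Java. It does
not use Lucene and does not reproduce Lucene's scoring formula. It assumes a
simplified BM25-style term-frequency saturation tf / (k1 + tf) with no IDF and
no length normalisation, and the class name, method names, field weights and
k1 value are all made up for illustration. It compares summing per-field
scores with combining the term frequencies first and saturating once.

```java
public class FieldScoreSketch {

    // Simplified BM25-style saturation (assumption: no IDF, no length
    // normalisation). Grows with tf but stays below 1.
    static double saturate(double tf, double k1) {
        return tf / (k1 + tf);
    }

    // Score each field separately, then add the per-field scores.
    static double sumOfFieldScores(double[] tfPerField, double[] weights, double k1) {
        double score = 0.0;
        for (int i = 0; i < tfPerField.length; i++) {
            score += weights[i] * saturate(tfPerField[i], k1);
        }
        return score;
    }

    // Combine the weighted term frequencies first, then saturate once.
    static double combinedTfScore(double[] tfPerField, double[] weights, double k1) {
        double tf = 0.0;
        for (int i = 0; i < tfPerField.length; i++) {
            tf += weights[i] * tfPerField[i];
        }
        return saturate(tf, k1);
    }

    public static void main(String[] args) {
        double k1 = 1.2;                  // hypothetical parameter value
        double[] weights = {1.0, 1.0};    // hypothetical field weights
        double[] oneField = {2.0, 0.0};   // term occurs twice in one field
        double[] twoFields = {1.0, 1.0};  // term occurs once in each field

        System.out.println("summed,   one field : " + sumOfFieldScores(oneField, weights, k1));
        System.out.println("summed,   two fields: " + sumOfFieldScores(twoFields, weights, k1));
        System.out.println("combined, one field : " + combinedTfScore(oneField, weights, k1));
        System.out.println("combined, two fields: " + combinedTfScore(twoFields, weights, k1));
    }
}
```

Under these assumptions both documents contain the term twice, yet the summed
variant scores the document with the occurrences spread over two fields
higher, because the saturation is applied once per field. The combined
variant scores both documents the same.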

Karl



> --- Original Message ---
> From: jian chen <chenjian1227@gmail.com>
> To: java-user@lucene.apache.org
> Subject: Re: About searching in multiple fields with one query
> Date: Sun, 13 Nov 2005 15:29:53 -0800
> 
> Hi, Karl,
> 
> Looking at the Lucene 1.2 source code, it looks to me as if the
> MultiFieldQueryParser generates a BooleanQuery. Each sub-query within the
> BooleanQuery is for one field. The actual calculation of the score is done
> in BooleanScorer.java, where the scores from the sub-queries are
> accumulated.
> 
> So, without going into more detail on BooleanScorer.java, it seems to
> me that Lucene 1.2 creates a score from each field that is involved and
> then calculates a combined score (a simple summation of the scores from
> each field).
> 
> I like the simplicity of Lucene 1.2, and am considering porting the
> compound file format back to Lucene 1.2 so it will be more robust.
> 
> Cheers,
> 
> Jian
> 
> On 11/13/05, Karl Koch <TheRanger@gmx.net> wrote:
> >
> > Hello all,
> >
> > I have a question about searching within multiple fields. I have the
> > following code for doing that (searchFields provides two fields in which
> > I want to search):
> >
> > IndexSearcher searcher = new IndexSearcher(indexDirectory);
> > // search over multiple index fields
> > Query query = MultiFieldQueryParser.parse(queryString, searchFields,
> > analyser);
> > hits = searcher.search(query);
> >
> > I am wondering how this is done internally (I am using Lucene 1.2). Does
> > Lucene 1.2 merge the terms of the two fields and create a single score
> > from this? Or does Lucene 1.2 create a score from each field that is
> > involved and then calculate a combined score from all those?
> >
> > Karl
> >
> > --
> > Are you already making calls, or are you still saving?
> > NEW: GMX Phone_Flat http://www.gmx.net/de/go/telefonie
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 
