lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: simple (?) question about scoring
Date Fri, 03 Nov 2006 00:49:13 GMT

: le list is not ordered (I do not know the details of the search
: angine, I only have its result for a query)
:
: then I have this list of documents, which represents a subset of the corpus
:
: I have to rank the documents of the list, using your scoring algorithm

In other words, out of a large copus C, this webservice hase
told you that the documents comprising subset S is the top N matching
documents for your query Q (where N << sizeof(C))

your goal is to sort S as best as possible.

You could try indexing all the docs in S in a Lucene RAMDirectory and then
search on them, but my orriginal point about the score being
fairly meaningless in an index of only 1 document still applies somewhat
... if all of the documents you get back allready have a lot in common
9they must have something in common or hte webservice wouldn't havre
returned them in response to your query) it may be hard to get a
meaningful document frequency of the words in your query.

you also may run into confusion about what exactly your query "is" and
wether or not your interpretation matches that of the webservice ... at a
very simplisitc level, if the query is "Java Lucene" and your webservice
interprets that as an "OR" query and you interpret that as an "AND" query,
you might find that the scores you compute for all the docs are 0.

If i were in your shoes, i'd try to work with whoever runs this webservice
to make it return more usefull informatoin -- at the very least to return
results in sorted order.


-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message