Mailing-List: contact lucene-user-help@jakarta.apache.org; run by ezmlm
Date: Sat, 6 Sep 2003 22:44:45 -0400
Subject: Re: Lucene features
Content-Type: text/plain; charset=US-ASCII; format=flowed
Mime-Version: 1.0 (Apple Message framework v552)
From: Erik Hatcher <erik@ehatchersolutions.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Content-Transfer-Encoding: 7bit
In-Reply-To: <3F592029.30301@seznam.cz>
Message-Id: <3DC24860-E0DD-11D7-ADF3-000393A564E6@ehatchersolutions.com>

On Friday, September 5, 2003, at 07:45  PM, Leo Galambos wrote:
>> And for the second time today.... QueryFilter.  It allows narrowing 
>> the documents queried to only the documents from a previous Query.
>
>
> I guess, it would not be an ideal solution - the first query does two 
> things a) it selects a subset from the corpus; b) it assigns a 
> relevance to each document of this subset. Your solution omits the 
> second point. It implies, the solution will not return "good hit 
> lists", because you will not consider the information value of the 
> first query which was given to you by a user.

Yes, you're right.  Getting the scores of a second query based on the 
scores of the first query is probably not trivial, but probably 
possible with Lucene.  And that combined with a QueryFilter would do 
the trick I suspect.  Somehow the scores of the first query could be 
remembered and used as a boost (or other type of factor) the scores of 
the second query.

Am I off base here?

> Thus I think, Chris would implement something more complex than 
> QueryFilter. If not, the results will be poorer than with the 
> commercial packages he may get. He could use a different model where 
> "AND" is not an associative operator (i.e. some modification of the 
> extended Boolean model). It implies, he would implement it in 
> Similarity.java (if I remember that class name correctly).

Right... but you'd still need the filtering capability as well, I would 
think - at least for performance reasons.

	Erik