>> But Drill Down searching is very desirable. It's where you're able to
>> search
>> within the results of a previous search. I'm assuming that I'll have to
>> implement that myself, by keeping a copy of the previous Hits list,
>> and only
>> returning results that are in both lists.
>
>
> And for the second time today.... QueryFilter. It allows narrowing
> the documents queried to only the documents from a previous Query.
I guess, it would not be an ideal solution - the first query does two
things a) it selects a subset from the corpus; b) it assigns a relevance
to each document of this subset. Your solution omits the second point.
It implies, the solution will not return "good hit lists", because you
will not consider the information value of the first query which was
given to you by a user.
For instance, "neologism" > "George Bush" (1st>2nd query) would return
different order of hits than "George Bush" > "neologism". Other
examples, "Prague Berlin" > "flight" (I must go there, and I prefer an
airplane) versus "flight" > "Prague Berlin" (I must fly, and I prefer
Berlin).
Thus I think, Chris would implement something more complex than
QueryFilter. If not, the results will be poorer than with the commercial
packages he may get. He could use a different model where "AND" is not
an associative operator (i.e. some modification of the extended Boolean
model). It implies, he would implement it in Similarity.java (if I
remember that class name correctly).
Leo
|