lucene-java-user mailing list archives

From Mathieu Lecarme <math...@garambrogne.net>
Subject Re: Several questions about scoring/sorting + random sorting in an image/related application
Date Fri, 15 Jun 2007 15:42:25 GMT
The first step is to fill a Set with the collection IDs.
The second step is to sort it.

You can do that with a SortedSet, can't you?
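
One way to read these two steps, sketched in plain Java without any Lucene calls. The class name CollectionRanking, both method names and the scores map are made up for illustration; the list of collection IDs is assumed to come from the hits of the image query. The scores live outside the index, so changing them needs no re-indexing.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene.
class CollectionRanking {

    // Step 1: feed a Set with the collection ID of every matching image.
    // Duplicates disappear, so each collection shows up once.
    static Set<Integer> collectCollections(List<Integer> collectionIdsOfHits) {
        return new HashSet<>(collectionIdsOfHits);
    }

    // Step 2: sort the collection IDs, highest score first.
    // Collections without a score are treated as score 0 (an assumption).
    static SortedSet<Integer> sortByScore(Set<Integer> ids, Map<Integer, Integer> scores) {
        Comparator<Integer> byScoreDescending = (a, b) -> {
            int c = Integer.compare(scores.getOrDefault(b, 0), scores.getOrDefault(a, 0));
            // Tie-break on the ID so a TreeSet keeps collections with equal scores.
            return c != 0 ? c : Integer.compare(a, b);
        };
        SortedSet<Integer> sorted = new TreeSet<>(byScoreDescending);
        sorted.addAll(ids);
        return sorted;
    }
}
```

With the scores from the original question (collection #1 = 5, collection #3 = 10), sortByScore yields 3 then 1; lowering collection #3 to 2 yields 1 then 3. Images would then be listed collection by collection in that order.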

M.


Antoine Baudoux wrote:
> Could you be more precise? I don't understand what you mean.
>
>
>
> On 15 Jun 2007, at 17:20, Mathieu Lecarme wrote:
>
>> Your request seems to be a two-step query.
>> First step: you select the images, and then their collections.
>> Second step: you sort the collections.
>>
>> Could a BitVector help you?
>>
>> M.
>> Antoine Baudoux wrote:
>>>     Hi,
>>>
>>>     I'm developing an image database. Each Lucene document
>>> representing an image contains (among other fields):
>>>
>>>     - a date field
>>>     - a collection field containing the ID of the collection the image
>>> belongs to.
>>>
>>>     I want to be able to give a score to each collection. Collections
>>> with a higher score appear first in the results. I want to avoid
>>> re-indexing all the documents each time I change my collection scores.
>>>
>>>     For example, on day 1 I decide to give collection #1 a score of 5
>>> and collection #3 a score of 10 --> images belonging to collection #3
>>> appear first in search results.
>>>     On day 2 I give collection #3 a score of 2 --> images belonging to
>>> collection #1 appear first in search results.
>>>
>>>     I have read the Lucene docs, and from what I understand there are
>>> many ways to achieve what I want:
>>>
>>>
>>>     - I can use a very big BooleanQuery (an OR query, in fact)
>>> containing one TermQuery per collection ID, setting the correct boost
>>> factor for each TermQuery. The problem with this is that I have 300
>>> collections, so I have a boolean query with 300 terms that I append to
>>> each query I make. I am afraid that it will be slow.
>>>
>>>     - I can use a ValueSourceQuery, where for each document I compute
>>> a custom score based on the value of the collection field. Will it be
>>> faster than the first solution?
>>>
>>>     - I can do advanced things such as writing a custom HitCollector,
>>> or a custom Query.
>>>
>>>     - I can add another field to each document, containing a computed
>>> custom score, then I could sort on that field. But I want to avoid
>>> this solution at all costs, since it would mean re-indexing all the
>>> documents each time the collection scores change.
>>>
>>>     What solution do you suggest? Is there another solution that I
>>> didn't mention?
>>>
>>>     More recent documents should also come first: the final sorting
>>> should in fact be a weighted sum of an image's collection score and
>>> its date: the most recent images from the best-scored collections
>>> come first, then the most recent from lower-scored collections, then
>>> the less recent from the best-scored, and so on. I would also like to
>>> be able to adjust the balance between date and collection score.
>>>
>>>     What solution do you suggest?
>>>
>>>
>>>     I would also like to implement random sorting. My solution is: I
>>> create 12 new fields, R1 to R12, for each document, each containing a
>>> random number between 1 and 12. To get a random sort, I sort each day
>>> with a different combination of R1 .. R12. For example:
>>>
>>>     Day 1: I sort by R1, then R4, then R5, ...
>>>     Day 2: I sort by R10, then R9, then R2, ...
>>>     etc.
>>>
>>> Is this a good solution? Is there another way to do it?
>>>
>>>
>>>     Many thanks in advance for your answers.
>>>
>>> Antoine
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>>
>>
>
>

