lucene-java-user mailing list archives

From Paul Lynch <pabloly...@yahoo.com>
Subject Re: Advice on Custom Sorting
Date Mon, 25 Sep 2006 20:42:58 GMT
Thanks for the quick response Erick.

"index the documents in your preferred list with a 
field and index your non-preferred docs with a field
subid?"

I considered this approach and dismissed it because the
list of preferred ids changes so frequently (every 10
mins...ish), but maybe I was a little hasty in doing so.
I will investigate the overhead of updating all docs in
the index each time my list refreshes. I had assumed it
was prohibitive, but I know what they say about
assumptions :)
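
One way to gauge that overhead is to work out which SubIds actually
changed status between two refreshes, since only their documents would
need a different field. The sketch below is plain Java with no Lucene
calls; the class and method names are hypothetical, and whether
re-indexing just those documents is cheap enough on a 5M-document
index is an assumption that would still need measuring.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class PreferredListDiff {

    /** SubIds that are in newList but were not in oldList. */
    public static Set<String> newlyPreferred(Set<String> oldList, Set<String> newList) {
        Set<String> result = new TreeSet<>(newList);
        result.removeAll(oldList);
        return result;
    }

    /** SubIds that were in oldList but are missing from newList. */
    public static Set<String> noLongerPreferred(Set<String> oldList, Set<String> newList) {
        Set<String> result = new TreeSet<>(oldList);
        result.removeAll(newList);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical SubIds, for illustration only.
        Set<String> before = new TreeSet<>(Arrays.asList("S1", "S2", "S3"));
        Set<String> after = new TreeSet<>(Arrays.asList("S2", "S3", "S4"));
        System.out.println("re-index as preferred:     " + newlyPreferred(before, after));
        System.out.println("re-index as non-preferred: " + noLongerPreferred(before, after));
    }
}
```

If the list churns only slightly every refresh, the two sets stay
small; if it is replaced wholesale, this saves nothing.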

If I can make this workable, the beauty of this solution
would be that I would only need to query once. If I had a
field which indicates whether a doc is preferred or not,
"all" I would have to do is sort across the two fields:
the preferred flag first, then the field the user chose.
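
The ordering described here amounts to a two-key comparison. A minimal
sketch in plain Java, not the Lucene sorting API: the Doc class, its
field names and the sample values are all hypothetical, and DateAdded
is assumed to be stored as an ISO yyyy-MM-dd string so that string
order matches date order.

```java
import java.util.ArrayList;
import java.util.List;

public class PreferredFirstSort {

    /** Hypothetical stand-in for an indexed document. */
    public static final class Doc {
        public final String subId;
        public final boolean preferred;
        public final String dateAdded; // assumed ISO yyyy-MM-dd

        public Doc(String subId, boolean preferred, String dateAdded) {
            this.subId = subId;
            this.preferred = preferred;
            this.dateAdded = dateAdded;
        }
    }

    /** Preferred docs first; within each group, ascending DateAdded. */
    public static int compare(Doc a, Doc b) {
        if (a.preferred != b.preferred) {
            return a.preferred ? -1 : 1;
        }
        return a.dateAdded.compareTo(b.dateAdded);
    }

    /** Returns the SubIds in sorted order without modifying the input. */
    public static List<String> order(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(PreferredFirstSort::compare);
        List<String> ids = new ArrayList<>();
        for (Doc d : sorted) {
            ids.add(d.subId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("S9", false, "2006-09-01"));
        docs.add(new Doc("S1", true, "2006-09-20"));
        docs.add(new Doc("S7", false, "2006-08-15"));
        docs.add(new Doc("S2", true, "2006-09-05"));
        System.out.println(order(docs));
    }
}
```

Swapping the second key for whatever field the user picked leaves the
preferred-first guarantee untouched.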

Thanks again Erick. Any other suggestions are most
welcome.

Regards,
Paul

--- Erick Erickson <erickerickson@gmail.com> wrote:

> OK, a really "off the top of my head" response, but what the heck....
>
> I'm not sure you need to worry about filters. Would it work for you to
> index the documents in your preferred list with a field (called, at
> the limit of my creativity, preferredsubid <G>) and index your
> non-preferred docs with a field subid? You'd still have to fire two
> queries, one on subid (to pick up the ones in your non-preferred list)
> and one on preferredsubid.
>
> Since there's no requirement that all docs have the same fields, your
> preferred docs could have ONLY the preferredsubid field and your
> non-preferred docs ONLY the subid field. That way you wouldn't have to
> worry about picking the docs up twice.
>
> Merging should be simple then, just iterate over however many hits you
> want in your preferredHits object, then tack on however many you want
> from your nonPreferredHits object. All the code for the two queries
> would be identical, the only difference being whether you specify
> "subid" or "preferredsubid"......
>
> I can imagine several variations on this scenario, but they depend on
> your problem space.
>
> Whether this is the "best" or not, I leave as an exercise for the
> reader.
> 
> Best
> Erick
> 
> On 9/25/06, Paul Lynch <pablolynch@yahoo.com> wrote:
> >
> > Hi All,
> >
> > I have an index containing documents which all have a field called
> > SubId, which holds the ID of the Subscriber that submitted the data.
> > This field is STORED and UN_TOKENIZED.
> >
> > When I am querying the index, the user can choose a number of
> > different ways to sort the Hits. The problem is that I have a list of
> > SubIds that should appear at the top of the results list regardless
> > of how the index is sorted. In other words, let's suppose the Hits
> > should be sorted by DateAdded: I require the Hits to be sorted by
> > DateAdded for the SubIds in my list and then by DateAdded for the
> > SubIds not in my list.
> >
> > From reading previous discussions on the mailing list, I believe I
> > could achieve what I need by writing custom filters, i.e. run the
> > query first with a custom filter for the SubIds in my list and then a
> > second time with a custom filter for the SubIds not in my list, and
> > then "merge" the results.
> >
> > I suppose my question is simple: Is there a better way to achieve
> > this?
> >
> > A couple of bits of info which would influence the best design:
> >
> > - Index contains roughly 5M documents
> > - There can be up to 10K different unique SubIds
> > - My "Preferred SubId List" could contain any combination of the 10K
> >   SubIds, including all or none of them
> > - My "Preferred SubId List" gets updated about 10 times an hour, so I
> >   could cache the custom filters
> >
> > Thanks in advance,
> > Paul
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org

