lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Weird time results doing wildcard queries
Date Fri, 09 Sep 2005 03:01:24 GMT

: Which makes me wonder whether the caching logic of Hits, optimized for
: random- rather than linear-access, and not tuneable or controllable in
: 1.4.3, should be reviewed for a subsequent release, at least the
: API-breaking 2.0.  I'll wager that a majority of applications do nothing
: other than a one-time linear retrieval of Documents from Hits, with the
: potential for a lot of wasted cycles for those that retrieve more than a
: small number.

I agree it should be more tunable, but I disagree with your wager.  I
suspect that there are a lot of stateless applications out there that
support "paginated results".  For those that only ever access one or two
pages and have a small page size, the current Hits works well (and I suspect
that is what it was optimized for).

What doesn't make sense to me is that the constructor always fetches the
first 100 -- which is a waste if the application is currently interested in
results 101 and up.
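
A toy model of that cost may help. It is not Lucene code: the class and
method names are made up, and it assumes that the constructor asks for 100
hits up front and that a cache miss on hit index n re-runs the whole search
asking for 2 * n hits (the "2" in getMoreDocs).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, not Lucene code. Assumptions: the constructor fetches 100
// hits up front, and a miss on hit index n re-runs the search for 2 * n hits.
class HitsFetchModel {
    static final int INITIAL_FETCH = 100; // assumed eager fetch size

    // Sizes of the searches run before the hit at index `target` is available,
    // assuming the query matches at least that many documents.
    static List<Integer> searchSizes(int target) {
        List<Integer> sizes = new ArrayList<>();
        sizes.add(INITIAL_FETCH);        // done by the constructor, always
        if (target >= INITIAL_FETCH) {
            sizes.add(target * 2);       // the search is re-run from scratch
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(searchSizes(100)); // prints [100, 200]
    }
}
```

Under those assumptions, an application that starts at hit index 100 (the
101st result) pays for a search of 100 hits it never looks at, followed by
a second search of 200.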

Off the top of my head, I would imagine that a useful set of API changes
would be...

 * add Hits.setRetrievalFactor(float); // replace "2" in getMoreDocs
 * add Hits.setDocCacheSize(int); // modify Hits.maxDocs
 * make Hits.getMoreDocs(int) package protected
 * add Searcher.makeHits(Query,Filter,Sort); // use in search, override in subclasses
 * move the call to getMoreDocs(int) from Hits to Searcher.search

...that way the behavior stays the same, there are no major API changes,
and applications that want to customize the amount of caching/prefetching
can do so by subclassing (Index)Searcher with some very simple method
overrides.
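
A rough, self-contained sketch of how those changes might fit together
follows. None of it is real Lucene code: SketchSearcher, SketchHits and
PagingSearcher are hypothetical stand-ins, the "search" only counts how
much work was requested, and setDocCacheSize is left out for brevity. The
min + 1 guard in getMoreDocs is an addition for this sketch, so that a Hits
created without the eager fetch can still serve hit 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, not Lucene classes. The searcher just counts work.
class SketchSearcher {
    final int totalHits;
    int searchCount = 0;     // number of searches executed
    int docsRequested = 0;   // sum of n over all searches

    SketchSearcher(int totalHits) { this.totalHits = totalHits; }

    // Pretend search: returns the ids of the top n hits (0, 1, 2, ...).
    List<Integer> topDocs(int n) {
        searchCount++;
        docsRequested += n;
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(n, totalHits); i++) ids.add(i);
        return ids;
    }

    // Proposed factory hook. The default keeps the eager fetch of 100 hits.
    SketchHits makeHits() {
        SketchHits hits = new SketchHits(this);
        hits.getMoreDocs(50);
        return hits;
    }
}

class SketchHits {
    private final SketchSearcher searcher;
    private float retrievalFactor = 2.0f;   // stands in for the hard-coded "2"
    private List<Integer> cached = new ArrayList<>();

    SketchHits(SketchSearcher searcher) { this.searcher = searcher; }

    void setRetrievalFactor(float factor) {
        if (factor < 1.0f) throw new IllegalArgumentException("factor must be >= 1");
        retrievalFactor = factor;
    }

    // Package-private so a searcher in the same package can trigger the prefetch.
    void getMoreDocs(int min) {
        if (cached.size() > min) min = cached.size();
        // min + 1 guards the case where no eager fetch happened (sketch-only addition).
        int n = Math.max(min + 1, (int) Math.ceil(min * retrievalFactor));
        cached = searcher.topDocs(n);
    }

    int doc(int i) {
        if (i >= cached.size()) getMoreDocs(i);
        return cached.get(i);   // throws if i is past the last hit
    }
}

// A subclass that skips the eager fetch and over-fetches less aggressively.
class PagingSearcher extends SketchSearcher {
    PagingSearcher(int totalHits) { super(totalHits); }

    @Override
    SketchHits makeHits() {
        SketchHits hits = new SketchHits(this);
        hits.setRetrievalFactor(1.5f);
        return hits;
    }
}

class HitsProposalDemo {
    public static void main(String[] args) {
        SketchSearcher[] searchers = { new SketchSearcher(1000), new PagingSearcher(1000) };
        for (SketchSearcher s : searchers) {
            SketchHits hits = s.makeHits();
            for (int i = 100; i < 110; i++) hits.doc(i);   // one page of 10, starting at hit 100
            System.out.println(s.searchCount + " searches, " + s.docsRequested + " docs requested");
        }
    }
}
```

In this model, reading hits 100 through 109 costs two searches (100 + 200
hits requested) with the default makeHits, and a single search of 150 hits
with the override.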


-Hoss



