lucene-java-user mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: pre computing possible search results narrowing and hit counts on those
Date Thu, 31 Mar 2005 13:47:28 GMT
As suggested before, the real killer for performance
in Lucene is reading a stored field. It doesn't matter
how small the one field you want is: Lucene will read
*all* stored fields for that document off the disk. If
you have a large stored "body" field, it gets read too
when you try to read the "state" stored field.

Your example code in the last post will fall foul of
this. The two proposed solutions (Doug's and mine)
avoid the need to read *stored* fields by making use
of the *indexed* part of your "state" field. TermDocs
is the way to find all the docs that contain a given
value of an indexed field, e.g. all docs with the state
"Texas".
Doug's suggested solution uses a cached mapping of
docIds to field values (using FieldCache) whereas my
approach re-reads the termDocs each time. 
I have found re-reading termDocs to be pretty fast but
you may want to sacrifice some memory for extra speed
- your choice.
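
To make the trade-off concrete, here is a rough sketch of the two
counting strategies in plain Java. It is a toy model, not Lucene code:
the postings map stands in for what TermDocs would enumerate for each
"state" term, the String[] stands in for the docId-to-value array a
FieldCache would hold, and the BitSet of matching docIds is my own
assumption about how the search results are represented. The class and
method names (FacetCountSketch, countByPostings, buildCache,
countByCache) are made up for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class FacetCountSketch {

    // Approach 1 (re-read the postings each time): for every term of the
    // indexed field, walk its docId list and count the docs that are hits.
    // Cost grows with the total number of postings for the field.
    static Map<String, Integer> countByPostings(Map<String, int[]> postings, BitSet hits) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            int n = 0;
            for (int doc : entry.getValue()) {
                if (hits.get(doc)) {
                    n++;
                }
            }
            if (n > 0) {
                counts.put(entry.getKey(), n);
            }
        }
        return counts;
    }

    // Approach 2, step 1: build the docId -> value array once and keep it
    // in memory. Assumes at most one value per document for this field.
    static String[] buildCache(Map<String, int[]> postings, int maxDoc) {
        String[] cache = new String[maxDoc];
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            for (int doc : entry.getValue()) {
                cache[doc] = entry.getKey();
            }
        }
        return cache;
    }

    // Approach 2, step 2: for each hit, look its value up in the array.
    // Cost grows with the number of hits, at the price of the cached array.
    static Map<String, Integer> countByCache(String[] cache, BitSet hits) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
            String value = cache[doc];
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical six-document index: term -> docIds containing it.
        Map<String, int[]> postings = new HashMap<>();
        postings.put("Texas", new int[] {0, 2, 5});
        postings.put("Ohio", new int[] {1, 3});
        postings.put("Utah", new int[] {4});

        // Hypothetical search result: docs 0, 2, 3 and 5 matched the query.
        BitSet hits = new BitSet(6);
        hits.set(0);
        hits.set(2);
        hits.set(3);
        hits.set(5);

        System.out.println(countByPostings(postings, hits));               // {Ohio=1, Texas=3}
        System.out.println(countByCache(buildCache(postings, 6), hits));   // {Ohio=1, Texas=3}
    }
}
```

Neither version touches a stored field. The first needs no extra
memory but walks every posting for the field on each request; the
second pays for the array once and then does one lookup per hit.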

Cheers,
Mark

