lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Fetch Documents Without Retrieveing All Fields
Date Sun, 09 Apr 2006 23:21:43 GMT

first off, i only skimmed the url you posted. i may have missed the point,
but it appears to be a description of how to add restricted stored field


: Of course there is no doubt that search in Lucene index is faster but
: sometimes the retrieving the hitDocs is slower(for Ex. when we try to
: retrieve more than 10000 documents  from hits). May be my scenario is a

you most certainly do *NOT* want to use the Hits class if you are
accessing more than a few hundred of the "first" results (be it by score, or
by some sorting option).  Hits will re-execute your search many times as
you walk farther down the results (it is designed for a simplified use
case which you certainly do not meet)

you most likely want to use one of the "Expert" methods ... either TopDocs
or TopFieldDocs if you are using sorting, or a HitCollector if you can get
away with it.


: I have integrated lucene search engine for one of our project in the
: company, there I have a book index with each document has more than 15
: fields to do speific search, but out of that after I do the search I
: just want to retrieve the value of one field named "DBID" which is the
: database table column id and for rendering in the frontend I retrieve
: the data from database. In this case I really don't require all the

1) if the only field you ever need *after* a search is the DBID field,
then make sure you aren't STOREing any fields but that one -- it will make
your index smaller.

2) your use case sounds like it could best be served by leveraging the
FieldCache -- as long as each document contains only one value for the
DBID field, and as long as you index the DBID field, you can use the
FieldCache for that field (along with a HitCollector, or
TopDocs/TopFieldDocs) to access the DBID of every doc much faster than you
can get the stored value.
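
a rough sketch of the shape of option 2, in case it helps. this is a toy
model in plain Java, NOT Lucene code: DocCollector, buildDbIdCache, search
and the "b-10x" ids are all made up for illustration. the array stands in
for what the FieldCache would hand you (one value per document, indexed by
doc number), and the callback stands in for a HitCollector. in a Lucene of
this vintage the real calls would be along the lines of
FieldCache.DEFAULT.getStrings(reader, "DBID") and
searcher.search(query, hitCollector), but check the javadocs for your version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DbIdLookupSketch {

    /** Hypothetical stand-in for a HitCollector: called once per matching doc. */
    interface DocCollector {
        void collect(int doc, float score);
    }

    /** Hypothetical stand-in for a FieldCache entry: one DBID per doc number. */
    static String[] buildDbIdCache(List<String> dbIdPerDoc) {
        return dbIdPerDoc.toArray(new String[0]);
    }

    /** Toy "search": reports every doc with a score above zero to the collector. */
    static void search(float[] scores, DocCollector collector) {
        for (int doc = 0; doc < scores.length; doc++) {
            if (scores[doc] > 0f) {
                collector.collect(doc, scores[doc]);
            }
        }
    }

    /** One pass over the matches; each DBID is a plain array lookup by doc number. */
    static List<String> matchingDbIds(float[] scores, String[] dbIdCache) {
        List<String> ids = new ArrayList<>();
        search(scores, (doc, score) -> ids.add(dbIdCache[doc]));
        return ids;
    }

    public static void main(String[] args) {
        String[] cache = buildDbIdCache(Arrays.asList("b-101", "b-102", "b-103", "b-104"));
        float[] scores = {0.9f, 0f, 0.4f, 0f};
        System.out.println(matchingDbIds(scores, cache)); // prints [b-101, b-103]
    }
}
```

the point of the shape: the per-document array is built once, and no stored
document is loaded while collecting matches, so the cost per match is a
single array access.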

I am 99.999% certain that using the FieldCache to get the one indexed
field value you want is going to be faster than any approach you might take
which relies on the stored field value -- with or without the
modifications described in that URL.  (and this way, you don't need to
manually patch your version of Lucene in a way that will be hard to
support in the future)

