lucene-java-user mailing list archives

From Michael McCandless
Subject Re: Boosting results
Date Mon, 10 Nov 2008 12:55:31 GMT

Well .. the FieldCache API is documented here (for 2.4.0):

For example, you can load ints like this:

     int[] values = FieldCache.DEFAULT.getInts(reader, "myfield");

This returns an array mapping docID --> int value for that field.  You  
need to ensure that field has only 1 token per document (and that it  
parses to an int, for this example).

But: it's slow to load a field for the first time.  LUCENE-1231  
(column-stride fields) aims to greatly speed up the load time.

It's also memory-consuming: the array holds one entry for every
document in the index.

Finally, you might want to instead look at Solr, which provides facet  
counting out of the box, rather than roll your own...
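
If you do roll your own, here is a rough sketch of the lookup step for
the "unique categories for matching docs" case. It is plain Java and
does not call Lucene at all: the String[] stands in for what something
like FieldCache.DEFAULT.getStrings(reader, "category") would hand back,
and the int[] stands in for the docIDs a collector gathered for the
query. The class and method names are made up for illustration, and the
null check assumes that documents without the field get a null entry.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Lucene.
class CategoryFacets {

    /**
     * categoryByDoc: docID -> category value, the shape of array a field cache returns.
     * matchingDocs:  docIDs that matched the query (for example, gathered by a collector).
     * Returns the distinct categories of the matching documents, sorted.
     */
    static SortedSet<String> uniqueCategories(String[] categoryByDoc, int[] matchingDocs) {
        SortedSet<String> seen = new TreeSet<String>();
        for (int doc : matchingDocs) {
            String category = categoryByDoc[doc];
            if (category != null) { // assumption: docs lacking the field have a null entry
                seen.add(category);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Stefan's four documents, renumbered from 0 because Lucene docIDs are 0-based.
        String[] categoryByDoc = {"A", "B", "B", "C"};
        // "text:john" matches the first and third documents.
        int[] matchingDocs = {0, 2};
        System.out.println(uniqueCategories(categoryByDoc, matchingDocs)); // prints [A, B]
    }
}
```

With about 200 categories the set stays tiny; the cost is in loading
the array, not in this loop.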


Stefan Trcek wrote:

> On Friday 07 November 2008 18:46:17 Michael McCandless wrote:
>> Sorting populates the field cache (internal to Lucene) for that
>> field,   meaning it loads all values for all docs and holds them in
>> memory. This makes the first query slow, and, consumes RAM, in
>> proportion to how large your index is.
> Can you direct me to the API how to access these cached values?
> I'd like to have a function like: "List all unique values of the
> categories (A, B, C...) for documents that match this query".
> i.e. for a query "text:john" show up categories=(A,B)
> Doc 1: category=A text=john
> Doc 2: category=B text=mary
> Doc 3: category=B text=john
> Doc 4: category=C text=mary
> This is intended for search refinement (I use about 200 categories).
> Sorry for hijacking this thread.
> Stefan
