The FieldCache API is documented here (for 2.4.0):
http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/search/FieldCache.html
For example, you can load ints like this:
FieldCache.DEFAULT.getInts(reader, "myfield");
This returns an array mapping docID --> int value for that field. You
need to ensure that the field has only one token per document (and that
it parses to an int, for this example).
But: loading a field for the first time is slow. LUCENE-1231
(column-stride fields) aims to greatly speed up the load time.
The cache also consumes memory, since it holds a value for every
document in the index.
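
For the "unique categories for documents that match this query" case,
here is a rough sketch of the lookup step in plain Java. It is not taken
from Lucene itself: the class name CategoryFacets and the method
uniqueValues are made up for illustration. It assumes you already have
an array indexed by docID (the shape FieldCache.DEFAULT.getStrings(reader,
"category") should give you for a single-token field) and the docIDs that
matched the query (collected, for instance, with a HitCollector).

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene.
final class CategoryFacets {

    /**
     * Returns the distinct field values seen among the matching documents.
     *
     * valuesByDoc[docID] is the value of the field for that document,
     * i.e. a docID-indexed array like the one a field cache holds.
     * matchingDocs holds the docIDs that matched the query.
     */
    static SortedSet<String> uniqueValues(String[] valuesByDoc, int[] matchingDocs) {
        SortedSet<String> unique = new TreeSet<String>();
        for (int docID : matchingDocs) {
            String value = valuesByDoc[docID];
            if (value != null) { // skip documents with no value for the field
                unique.add(value);
            }
        }
        return unique;
    }
}
```

With the four documents from the question numbered 0 to 3, the array is
{"A", "B", "B", "C"} and "text:john" matches docIDs 0 and 2, so
uniqueValues returns the set [A, B]. The work per query is one array
lookup per matching document, which is why the cached array is worth
its load time and memory.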
Finally, you might want to instead look at Solr, which provides facet
counting out of the box, rather than roll your own...
Mike
Stefan Trcek wrote:
> On Friday 07 November 2008 18:46:17 Michael McCandless wrote:
>>
>> Sorting populates the field cache (internal to Lucene) for that
>> field, meaning it loads all values for all docs and holds them in
>> memory. This makes the first query slow, and, consumes RAM, in
>> proportion to how large your index is.
>
> Can you direct me to the API how to access these cached values?
> I'd like to have a function like: "List all unique values of the
> categories (A, B, C...) for documents that match this query".
>
> i.e. for a query "text:john" show up categories=(A,B)
>
> Doc 1: category=A text=john
> Doc 2: category=B text=mary
> Doc 3: category=B text=john
> Doc 4: category=C text=mary
>
> This is intended for search refinement (I use about 200 categories).
> Sorry for hijacking this thread.
>
> Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>