lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Retrieving TermVectors from a Field over the full index?
Date Sun, 10 Jun 2007 17:31:49 GMT
Um, to return all counts of all terms in a field, what other option
*is* there except to walk the whole thing?

Have you looked at TermEnum, TermDocs, and TermFreqVector?
For that matter, TermPositionVector might also be of some use.
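
For the per-field counts in the question, the usual trick relies on the term dictionary being sorted by field first and then by term text: seek to the first term of the field and stop as soon as the field name changes. Below is a minimal, self-contained sketch of that idea. It is not Lucene code: the Term class, sampleIndex() and countsForField() are hypothetical stand-ins, and a TreeMap plays the role of the term dictionary. In Lucene 2.x the analogous calls should be IndexReader.terms(Term) to seek and TermEnum.term(), TermEnum.docFreq() and TermEnum.next() to scan, but check the javadoc of your version.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Stand-in for a sorted term dictionary; illustrative only, NOT the Lucene API.
class FieldTermCounts {

    /** Hypothetical (field, text) pair, ordered by field first and then by text. */
    static final class Term implements Comparable<Term> {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public int compareTo(Term other) {
            int byField = field.compareTo(other.field);
            return byField != 0 ? byField : text.compareTo(other.text);
        }
    }

    /** Seeks to the first term of the field, then scans until the field changes. */
    static Map<String, Integer> countsForField(TreeMap<Term, Integer> dict, String field) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        // tailMap(..., true) models seeking to the first term >= (field, "").
        for (Map.Entry<Term, Integer> e : dict.tailMap(new Term(field, ""), true).entrySet()) {
            if (!e.getKey().field.equals(field)) {
                break; // past the requested field, so the rest is never visited
            }
            counts.put(e.getKey().text, e.getValue());
        }
        return counts;
    }

    /** Toy dictionary mirroring the example in the question (values are document counts). */
    static TreeMap<Term, Integer> sampleIndex() {
        TreeMap<Term, Integer> dict = new TreeMap<Term, Integer>();
        dict.put(new Term("author", "foo"), 5);
        dict.put(new Term("author", "bar"), 5);
        dict.put(new Term("title", "lucene"), 3);
        dict.put(new Term("aaa", "first"), 1);
        return dict;
    }

    public static void main(String[] args) {
        // prints {bar=5, foo=5}
        System.out.println(countsForField(sampleIndex(), "author"));
    }
}
```

Only the terms of the requested field are touched, so the cost depends on the number of distinct terms in that field rather than on the size of the whole dictionary.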

It would be easier to provide some help if you
1> mentioned what you'd tried already
2> mentioned what's inadequate about what you've tried.

Best
Erick

On 6/9/07, Benjamin Pasero <bpasero@rssowl.org> wrote:
>
> Hi,
>
> I wonder if this is possible:
>
> Return all Terms of a Field in the Index together with the number of
> occurrences in all documents.
>
> E.g. given 10 Documents with the Field "author" in the index, 5 of them
> having the value "foo" and 5 "bar", I would like to build a map with:
>
> [foo] -> 5
> [bar] -> 5
>
> I looked at what Luke is doing to show the top terms of a given field in
> the index, and it seems to iterate over all terms (using
> IndexReader#terms()). Isn't that quite inefficient? I would at least
> expect a method IndexReader#terms(String field) to limit the terms to
> the desired field.
>
> Thanks for helping,
> Ben
