lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pablo Gomes Ludermir <>
Subject Re: categorized search
Date Thu, 05 May 2005 18:37:19 GMT

That was partially what I needed. You got it right when I said I
needed the number of categories that I particular term appears (and it
But, I also would like to know in how many documents in each category
that term appears.

For instance: title:lucene appears in the category "search engines"
and "open source software", and it appears in the documents 1, 2 and 3
in the category "search engines" and in documents 4 and 7 in the
categoy "open source". I could not get it to work yet (maybe because
of my lack of experience with Lucene).
Someone could give me a hand???

On 4/24/05, Chris Hostetter <> wrote:
> : >I have indexed a field that describes the "category" of the document.
> : >Thus, I want to know how many categories have a specific term. Could
> : >someone help me to get this with good performance?
> I think I'm reading this question different than Chuck, so I'll toss out
> somethign totally different...
> as I understand it, you've indexed a bunch of documents, with a variety of
> fields, one of which is "category" (for example, maybe you are indexing
> news articles, that each have a "title", "description", "url", and
> "category").  Now you have a term like "title:lucene" (or
> "description:pope") and you want to know the number of unique terms in the
> category field that exist in articles that contain your input term.
> If that's what you're looking for, then you can problem achieve this by:
>   1) make a TermQuery for your input term (ie: "title:lucene")
>   2) put that TermQuery in a QueryFilter, and call bits(reader)
>   3) call FieldCache.DEFAULT.getStrings(reader,"category")
>   3) loop over the true bits in the BitSet from #3, and for each one, add
>      the corrisponding entry from the String[] in #4 to a Set.
> when you're all done, the Set will be the list of categories, and the size
> of that Set is the number (i think) you wanted.
> (DISCLAIMER: I've never acctaully used FieldCache, i'm just giving you my
> advice based on reading the javadocs)
> -Hoss
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

Pablo Gomes Ludermir

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message