lucene-java-user mailing list archives

From Mathias Bank <>
Subject Re: Creating tag clouds with lucene
Date Fri, 06 Nov 2009 08:25:52 GMT
Well, it could be a facet search if tags were available, but if you
just want a "tag cloud" generated from full text, I don't see how a
facet search could help to generate this cloud.
Unfortunately, I don't have tags in my data. What I need to know is
which terms (or multi-word terms) are used most often in this data.
At first I thought of using carrot2, which uses a specialized
clustering algorithm. But I wondered whether it is possible to get
the most used terms out of Lucene directly.
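
To make the "most used terms" step concrete, here is a minimal sketch in plain Java with no Lucene dependency. It assumes the per-term document frequencies are already available in a map; in Lucene 2.9 they could presumably be collected by walking the TermEnum returned by IndexReader.terms() and calling docFreq() on it, but that part is not shown. The class and method names (TopTerms, topTerms) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene.
class TopTerms {

    /** Returns the n terms with the highest document frequency, most frequent first. */
    static List<String> topTerms(Map<String, Integer> docFreq, int n) {
        // Min-heap on frequency: the weakest of the current candidates sits at the head.
        Comparator<Map.Entry<String, Integer>> byFreq =
                (a, b) -> Integer.compare(a.getValue(), b.getValue());
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byFreq);

        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            heap.offer(e);
            if (heap.size() > n) {
                heap.poll(); // drop the least frequent candidate, keeping at most n
            }
        }

        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result); // heap drains in ascending order; flip it
        return result;
    }

    public static void main(String[] args) {
        // Made-up frequencies, only to show the call.
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("lucene", 40);
        docFreq.put("index", 25);
        docFreq.put("search", 90);
        docFreq.put("cloud", 7);
        System.out.println(topTerms(docFreq, 2));
    }
}
```

The bounded heap keeps memory at n entries however many terms the index has, which matters when only the top 20 are wanted out of a very large term dictionary.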

Glen mentioned that he is doing this for full-text data, using the
IndexReader.termDocs(Term term) method. So I think he iterates over
all terms and counts how many documents each term occurs in. But what
I don't see is: how does this method work with a filter? Do you first
find all documents that match the filter and then iterate over all
terms, counting only the documents in this filtered set? I can't
imagine that this performs well, because I have more than 10 million
documents (and growing fast).
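
One way to read the filtered variant described above, as a sketch in plain Java rather than against the Lucene API: the filter is modelled as a BitSet of matching document ids, and each term's postings as an int array of document ids. Whether this is what Glen actually does is not known; the names FilteredTermCounts and count are hypothetical. In Lucene the postings would come from termDocs(Term) and the bits from the filter.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "iterate all terms, count only documents in the filtered set".
class FilteredTermCounts {

    /**
     * For each term, counts how many of its documents are also set in the filter.
     * postings maps a term to the ids of the documents that contain it.
     * Terms with no document inside the filter are left out of the result.
     */
    static Map<String, Integer> count(Map<String, int[]> postings, BitSet filter) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            int hits = 0;
            for (int docId : e.getValue()) {
                if (filter.get(docId)) {
                    hits++;
                }
            }
            if (hits > 0) {
                counts.put(e.getKey(), hits);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up postings for a six-document index.
        Map<String, int[]> postings = new HashMap<>();
        postings.put("lucene", new int[] {0, 1, 2, 5});
        postings.put("cloud", new int[] {1, 3});
        postings.put("facet", new int[] {4});

        BitSet filter = new BitSet();
        filter.set(1);
        filter.set(2);
        filter.set(3);

        System.out.println(count(postings, filter));
    }
}
```

Note that the loop visits every posting of every term, however small the filtered set is, so its cost grows with the whole index and not with the number of hits. That is the performance worry raised above.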


2009/11/6 Chris Lu <>:
> Isn't the tag cloud just another facet search? Only difference is the tag is
> multi-valued.
> Basically just go through the search results and find all unique tag values.
> --
> Chris Lu
> Mathias Bank wrote:
>> Hi,
>> I want to calculate a tag cloud for search results. I have seen that
>> it is possible to extract the top 20 words out of the Lucene index. Is
>> there also a way to extract the top 20 words out of search
>> results (or filter results) in Lucene?
>> Mathias
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
