lucene-java-user mailing list archives

From "wuqi" <>
Subject Problems about using Lucene to generate tag cloud..
Date Mon, 31 Mar 2008 08:00:35 GMT
I am trying to use a Lucene index to implement a tag cloud system. I added a new field named
"tags" to the index to store all the tags. We don't support tags of more than one word,
so the different tags of a document are simply separated by white space. The "tags" field
of a document may look like this:
doc1  tags: travel Beijing news
doc2  tags: beijing sports news
I can easily retrieve the tags of a single document, and also get the documents that carry
a certain tag, but it is hard to find an efficient way to get the most frequent tags from a
set of documents in this index. The set of documents is always generated dynamically; it may
be a search result or a category generated dynamically through clustering. The document set
can be very large, maybe tens of thousands or hundreds of thousands of documents. So simply
iterating over all the documents in the set to find the frequent tags might not be practical.
Do you have any better ideas?
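
To make the problem concrete, here is a minimal sketch of the counting step in plain Java. It is not Lucene API and not a definitive answer: the class `TagCloudSketch` and its methods are hypothetical names introduced for illustration. It assumes tags are compared case-insensitively (so "Beijing" and "beijing" count as one tag), that each document's tags are kept in memory as a small array of integer ordinals, and that the dynamically generated document set is given as a `BitSet` of document ids. The sketch still visits every document in the set, but it only touches small int arrays rather than loading stored fields.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: an in-memory map from doc id to tag ordinals.
class TagCloudSketch {
    private final Map<String, Integer> ordinals = new HashMap<>(); // tag text -> ordinal
    private final List<String> tags = new ArrayList<>();           // ordinal -> tag text
    private final List<int[]> docTags = new ArrayList<>();         // doc id -> tag ordinals

    /** Registers a document's whitespace-separated tags and returns its doc id. */
    int addDocument(String tagField) {
        String trimmed = tagField.trim();
        String[] parts = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        Set<Integer> seen = new LinkedHashSet<>(); // a tag repeated in one document counts once
        for (String part : parts) {
            String tag = part.toLowerCase(Locale.ROOT); // assumption: case-insensitive tags
            Integer ord = ordinals.get(tag);
            if (ord == null) {
                ord = tags.size();
                ordinals.put(tag, ord);
                tags.add(tag);
            }
            seen.add(ord);
        }
        int[] ords = new int[seen.size()];
        int i = 0;
        for (int ord : seen) {
            ords[i++] = ord;
        }
        docTags.add(ords);
        return docTags.size() - 1;
    }

    /** Returns up to k tags, most frequent first, counted over the given document set. */
    List<Map.Entry<String, Integer>> topTags(BitSet docSet, int k) {
        int[] counts = new int[tags.size()];
        for (int doc = docSet.nextSetBit(0);
             doc >= 0 && doc < docTags.size();
             doc = docSet.nextSetBit(doc + 1)) {
            for (int ord : docTags.get(doc)) {
                counts[ord]++;
            }
        }
        List<Map.Entry<String, Integer>> result = new ArrayList<>();
        for (int ord = 0; ord < counts.length; ord++) {
            if (counts[ord] > 0) {
                result.add(new AbstractMap.SimpleImmutableEntry<String, Integer>(
                        tags.get(ord), counts[ord]));
            }
        }
        // Sort by count descending, then by tag text so ties are deterministic.
        result.sort((a, b) -> {
            int c = Integer.compare(b.getValue(), a.getValue());
            return c != 0 ? c : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(result.subList(0, Math.min(k, result.size())));
    }
}
```

With the two example documents above registered through `addDocument` and both ids set in the `BitSet`, `topTags` would rank "beijing" and "news" (two documents each) ahead of "sports" and "travel". Whether such a per-document array fits in memory for a large index is a separate question that this sketch does not address.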
