lucene-java-user mailing list archives

From Samarendra Pratap <>
Subject Grouping results on the basis of a field
Date Mon, 21 Nov 2005 10:08:09 GMT
I am using Lucene 1.4.3. The basic functionality of the search is simple: enter a
keyword such as “java” and it displays all the books containing that keyword.
Now I have to add a feature that also shows the names of the top authors (let's say
the top 5 authors), that is, the authors who have the most books in this result set,
together with the number of their books.
Currently I am traversing all the results, reading the value of the author field
from each one and putting it in a hash (to keep a count of the books per author), but
this takes too much time. Is there a more efficient way of doing this?
I have 150,000 documents in my index. The field “author” can obviously contain more
than one value.
My current code block for finding the top 5 authors by number of books is the following.
    tfd =, null, nDocs, new Sort(sortCol));
    StringTokenizer fieldToken = null;
    HashMap fieldMap = new HashMap();
    for(int i=0;i<tfd.totalHits;i++)
    {
        fieldToken = new StringTokenizer(searcher.doc(tfd.scoreDocs[i].doc).get("author"));
        String token = fieldToken.nextToken();
        int count;
        try
        {
            // increase the count by one
            count = Integer.parseInt(fieldMap.get(token).toString());
            fieldMap.put(token, new Integer(count + 1));
        }
        catch(Exception e)
        {
            // put 1 in the new entry of author
            fieldMap.put(token, new Integer(1));
        }
    }
    // here I get the HashMap variable (fieldMap), which I am sorting later on
    // to find the top 5 hits for this result set.
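
To make the counting and the later sorting step concrete, here is a minimal sketch of just that part, with Lucene left out. It is only an illustration under assumptions: the class TopAuthors, the methods countAuthors and topN, the "," delimiter and the sample author values are all hypothetical, and it assumes Java 8 or later rather than the Java version typically used with Lucene 1.4.3. The author field values are passed in as plain strings, as they would come back from the stored field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

public class TopAuthors {

    // Counts one book per author token found in each author field value.
    // The delimiter set is an assumption; the original code used the
    // default StringTokenizer delimiters (whitespace).
    public static Map<String, Integer> countAuthors(List<String> authorFields, String delimiters) {
        Map<String, Integer> counts = new HashMap<>();
        for (String field : authorFields) {
            if (field == null) {
                continue; // document without an author value
            }
            StringTokenizer tokens = new StringTokenizer(field, delimiters);
            while (tokens.hasMoreTokens()) {
                String author = tokens.nextToken().trim();
                if (!author.isEmpty()) {
                    // insert 1 for a new author, otherwise add 1 to the existing count
                    counts.merge(author, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Returns up to n entries, highest count first; ties are ordered by name.
    public static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        // Hypothetical author field values, one string per matching book.
        List<String> fields = Arrays.asList("Gosling,Bloch", "Bloch", "Goetz,Bloch", "Gosling");
        for (Map.Entry<String, Integer> e : topN(countAuthors(fields, ","), 5)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Using the map's merge call avoids the parse-and-catch round trip for every token, and looping over all tokens counts every author of a multi-author book. It does not address the cost of loading each stored document, which is probably where most of the time goes.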
  Thanks & Regards
