lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Steinert, Fabian" <Fabian.Stein...@fokus.fraunhofer.de>
Subject AW: High CPU usage duing index and search
Date Wed, 15 Aug 2007 22:41:24 GMT
Hi Chew,
 
with Lucene you could try the following:
 
Make one query for each single value in each category (each Term):
1Q - Gender:M
2Q - Department:Accounting
3Q - Department:R&D
4Q - ...
 
with a custom HitCollector like the following example taken from org.apache.lucene.search.HitCollector
API Spec:
 
   Searcher searcher = new IndexSearcher(indexReader);
   final BitSet bits = new BitSet(indexReader.maxDoc());
   searcher.search(query, new HitCollector() {
       public void collect(int doc, float score) {
         bits.set(doc);
       }
     });
 
Thus you'l get one BitSet for each Term with bits set at DocIDs containing the Term.
 
Then do the combinatory part on these BitSets like:
 
  BitSet FemaleAndRnD = ((BitSet) RnD.clone()).andNot(Male);

Cheers,
Fabian
 

________________________________

Von: Chew Yee Chuang [mailto:yeechuang@tecforte.com]
Gesendet: Mi 15.08.2007 10:31
An: java-user@lucene.apache.org
Betreff: RE: High CPU usage duing index and search



Greetings,

I have tested with Mysql, the grouping is ok when there is not much records in the table,
but when I come across to performed grouping in a table which have 3 millions of records,
It really take a very long time to finish. Thus, Im looking at lucene and hope it can help.

Thank you
eChuang, Chew




Mime
View raw message