Date: Sun, 30 Nov 2003 09:55:11 -0800 (PST)
From: Kent Gibson
Subject: Re: raw hit count
To: Lucene Users List
Message-ID: <20031130175511.57392.qmail@web11506.mail.yahoo.com>

Thanks a mil, Erik.

I tried to make my own Filter class with a modified bits method, as per below:

    if (doc == interestingDoc) {
        bits.set(doc); // set bit for hit
    }

but this baby continues to always return 1! So then I looked at IndexReader, like you said, and ended up with something like this. It's probably a messy way of doing it, but I am happy.

    Term term = new Term("body", "mercedes");
    IndexReader ir = IndexReader.open(indexPath);
    TermDocs termdocs = ir.termDocs(term);
    int id = hits.id(i);
    // walk the postings for this term until we reach the hit's document
    while (termdocs.next()) {
        if (termdocs.doc() == id) {
            System.out.println(
                "Document number " + termdocs.doc() + " Freq: " + termdocs.freq());
        }
    }

It only works for single words, but I reckon I can just split up the query and then make multiple scans.

cheers,
kent

--- Erik Hatcher wrote:
> On Sunday, November 30, 2003, at 11:13 AM, Kent Gibson wrote:
> > as per Erik's idea I tried with the BitSet as follows:
> >
> > QueryFilter qf = new QueryFilter(query);
> > IndexReader ir = IndexReader.open(indexPath);
> > Searcher searcher2 = new IndexSearcher(ir);
> >
> > // get the bit set for the query
> > BitSet bits = qf.bits(ir);
>
> I did not mean to imply for you to call the bits method in this manner.
> In fact, you should not call it - the IndexSearcher calls it under the
> covers. I was implying that you could write your own Filter subclass
> that lit up a single-bit corresponding to the document you're
> interested in.
>
> > However I always get a result of 1, which I suppose is
> > has to do with this overlap thingy.
>
> No, not related with respect to a filter - two different concepts.
>
> > Is there not a simple way to just get some word
> > statistics out of a file?
>
> Look at the Lucene index format (from Lucene's main web page). Term
> frequencies are part of the statistics gathered, of course. You can
> get at the values there using IndexReader. This may be a lot
> lower-level than you desire, but what Lucene stores is there for you.
>
> Erik
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
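
The "split up the query and make multiple scans" idea from the message could look roughly like the sketch below. To keep it self-contained it does not touch Lucene at all: a plain in-memory map (term -> document number -> frequency) stands in for the postings that IndexReader.termDocs(term) would walk. The class name RawHitCount, the helpers freqInDoc, countPerTerm and samplePostings, and the sample data are all made up for illustration. Splitting on whitespace and lowercasing is also an assumption; a real index would need the same analyzer that was used at indexing time.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class RawHitCount {

    // Hypothetical stand-in for the index: term -> (document number -> frequency).
    static Map<String, Map<Integer, Integer>> samplePostings() {
        Map<String, Map<Integer, Integer>> postings = new HashMap<>();
        Map<Integer, Integer> mercedes = new LinkedHashMap<>();
        mercedes.put(0, 3);
        mercedes.put(2, 1);
        Map<Integer, Integer> benz = new LinkedHashMap<>();
        benz.put(2, 4);
        postings.put("mercedes", mercedes);
        postings.put("benz", benz);
        return postings;
    }

    // One scan for one term, mirroring the while (termdocs.next()) loop:
    // walk the term's postings and stop at the document we care about.
    static int freqInDoc(Map<String, Map<Integer, Integer>> postings, String term, int docId) {
        Map<Integer, Integer> docs = postings.get(term);
        if (docs == null) {
            return 0; // term does not occur anywhere in the index
        }
        for (Map.Entry<Integer, Integer> e : docs.entrySet()) {
            if (e.getKey() == docId) {
                return e.getValue();
            }
        }
        return 0; // term occurs, but not in this document
    }

    // Split the query text into words and run one scan per word.
    static Map<String, Integer> countPerTerm(
            Map<String, Map<Integer, Integer>> postings, String queryText, int docId) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : queryText.trim().toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty query yields a single empty token
            }
            counts.put(word, freqInDoc(postings, word, docId));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Per-term frequencies for document 2; prints {mercedes=1, benz=4}
        System.out.println(countPerTerm(samplePostings(), "mercedes benz", 2));
    }
}
```

With the real API, each call to freqInDoc would presumably become one ir.termDocs(new Term("body", word)) scan like the one in the message, so the cost grows with the number of query words times the length of each term's postings.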