Date: Sun, 30 Nov 2003 11:24:09 -0500
Subject: Re: raw hit count
From: Erik Hatcher
To: "Lucene Users List"
In-Reply-To: <20031130161356.83379.qmail@web11507.mail.yahoo.com>

On Sunday, November 30, 2003, at 11:13 AM, Kent Gibson wrote:
> as per Erik's idea I tried with the BitSet as follows:
>
> QueryFilter qf = new QueryFilter(query);
> IndexReader ir = IndexReader.open(indexPath);
> Searcher searcher2 = new IndexSearcher(ir);
>
> // get the bit set for the query
> BitSet bits = qf.bits(ir);

I did not mean to imply that you should call the bits method in this
manner. In fact, you should not call it yourself - the IndexSearcher
calls it under the covers. I was implying that you could write your own
Filter subclass that lights up the single bit corresponding to the
document you're interested in.

> However I always get a result of 1, which I suppose
> has to do with this overlap thingy.

No, that is unrelated to filters - they are two different concepts.

> Is there not a simple way to just get some word
> statistics out of a file?

Look at the Lucene index format documentation (linked from Lucene's
main web page). Term frequencies are part of the statistics gathered,
of course. You can get at the values there using IndexReader. This may
be a lot lower-level than you desire, but what Lucene stores is there
for you.

	Erik
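
A minimal sketch of the single-bit idea, for illustration only. It shows just the bit-set logic: in Lucene this would presumably be the body of a Filter subclass's bits(IndexReader) method, with the size taken from reader.maxDoc(). That wiring is left out so the example runs without the Lucene jar. The class name SingleDocBits, the method forDocument, and the document number 3 are hypothetical, and how you look up the document number is not covered here.

```java
import java.util.BitSet;

/**
 * Sketch of the bit-set logic behind a "single document" filter.
 * Assumption: inside Lucene, a Filter subclass would return this
 * BitSet from bits(IndexReader), sized with reader.maxDoc().
 */
final class SingleDocBits {

    /** Returns a bit set in which only the bit for docId is set. */
    static BitSet forDocument(int docId, int maxDoc) {
        if (docId < 0 || docId >= maxDoc) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        BitSet bits = new BitSet(maxDoc); // all bits start cleared
        bits.set(docId);                  // allow only this document
        return bits;
    }

    public static void main(String[] args) {
        // Hypothetical values: document 3 in an index of 10 documents.
        BitSet bits = forDocument(3, 10);
        System.out.println(bits); // prints {3}
    }
}
```

With a filter built this way handed to the searcher along with the query, only the one chosen document can appear in the hits, so the searcher rather than your own code ends up calling bits.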