lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dr. Lothar Simon" <lothar.si...@eidon.de>
Subject RE: How to count number of entries
Date Tue, 15 Oct 2002 12:40:07 GMT
Thanks a lot. I tried the code below, which works, but obviously requires
prior optimization (because of deleted entries which still count) to get
reliable results. This is the best solution so far. If someone has an idea
how to avoid optimization, please let me know.

		IndexReader reader = IndexReader.open(this.getIndexWorkdir());
		Term term = new Term(FULLTEXT_KEY_FIELD, key);
		count = reader.docFreq(term);

Lothar


-----Original Message-----
From: Ype Kingma [mailto:ykingma@xs4all.nl]
Sent: Tuesday, October 15, 2002 9:50 AM
To: Lucene Users List
Subject: Re: How to count number of entries


On Tuesday 15 October 2002 08:50, you wrote:
> I want to write a function countIndexEntries(key) to find out how many
> entries are there in the index database for a key. I read the faq entry
> about counting number of hits, but somehow it doesnt work as expected,
> please help:
>
> I create entries with
> 		indexDocument.add(Field.Keyword(FULLTEXT_KEY_FIELD,key));
> 		indexDocument.add(Field.Text(FULLTEXT_CONTENTS_FIELD,reader));
> 		indexwriter.addDocument(indexDocument);
>
> I try to count the number of entries for a key with
> 		Query query = QueryParser.parse(key, FULLTEXT_KEY_FIELD, analyzer);
> 		count = searcher.search(query).length();
>
> to find out if something is not yet indexed (count=0), already indexed
> (count=1), or index is inconsistent (count=2). But count is always = 0 :-(

This might depend on your analyzer.
For key fields, I prefer to use a lower level API on IndexReader:
construct a Term from the field name and the key value and use the
IndexReader method that returns an iterator over the docs containing the
term, termdocs() iirc. Beware of correct case for the key value.
In case you have mutliple key values to check, sorting them before going to
the  IndexReader helps performance
For pure boolean work, the IndexReader is quite useful.

> What is wrong or how to acchieve my goal with a different approach? Help
is
> appreciated.

It should work I think. You might try and print the output of the analyzer,
ie the actually indexed text, when constructing the document.

Have fun,
Ype

--
To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message