lucene-java-user mailing list archives

From Brisbart Franck <Franck.Brisb...@kelkoo.net>
Subject Re: query matching all documents
Date Thu, 22 May 2003 14:36:09 GMT
You're right.
When you delete a document, it is only marked as 'deleted'. The
document numbers stay the same until an optimize is done.

So, after deleting documents, if you want to list the remaining ones:
- either you loop from 0 to maxDoc() - 1 and skip the deleted docs
(with the same IndexReader)
- or you run an 'optimize' and, with a brand new IndexReader, loop
from 0 to numDocs() - 1 (there are no deleted docs left to skip).
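
To make the numbering concrete, here is a small toy model in plain Java.
It is not Lucene code: ToyIndex and all of its methods are made up for
illustration and only mimic the behaviour described above (deleting marks
a slot, optimize closes the gaps and renumbers).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for an index; it only imitates the numbering rules.
class ToyIndex {
    private List<String> docs = new ArrayList<>();
    private BitSet deleted = new BitSet();

    ToyIndex(String... names) {
        for (String n : names) {
            docs.add(n);
        }
    }

    // One more than the highest document number, deleted slots included.
    int maxDoc() { return docs.size(); }

    // Number of documents that are not marked as deleted.
    int numDocs() { return docs.size() - deleted.cardinality(); }

    boolean isDeleted(int i) { return deleted.get(i); }

    // Deleting only marks the slot; the other document numbers do not move.
    void delete(int i) {
        if (i < 0 || i >= docs.size()) {
            throw new IndexOutOfBoundsException("no such document: " + i);
        }
        deleted.set(i);
    }

    // Drops the marked slots, so the surviving documents are renumbered.
    void optimize() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        docs = live;
        deleted = new BitSet();
    }

    // Option 1 from above: loop up to maxDoc() and skip the deleted slots.
    List<String> liveDocs() {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < maxDoc(); i++) {
            if (isDeleted(i)) {
                continue;
            }
            out.add(i + " " + docs.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex("doc1.txt", "doc2.doc", "doc3.xls",
                                    "doc4.nfo", "doc4.pdf");
        idx.delete(1);
        System.out.println(idx.numDocs() + " of " + idx.maxDoc() + " " + idx.liveDocs());
        idx.optimize();  // option 2: after this, numDocs() == maxDoc()
        System.out.println(idx.numDocs() + " of " + idx.maxDoc() + " " + idx.liveDocs());
    }
}
```

With a real IndexReader, the 'skip' step would be a deleted-check before
reader.document(i) (as far as I know, IndexReader.isDeleted(i) does this),
or catching the exception mentioned earlier in this thread.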

franck

Guilherme Barile wrote:
> What I didn't figure out is, if I have some index like:
> [0] doc1.txt
> [1] doc2.doc
> [2] doc3.xls
> [3] doc4.nfo
> [4] doc4.pdf
> 
> and then I delete doc2.doc (document #1 in Lucene). Will the other
> document numbers change, or will there be a gap in my index?
> Let's list it again with doc2.doc deleted (supposing the gap will be
> there)
> 
> [0] doc1.txt
> [1] << DELETED >>
> [2] doc3.xls
> [3] doc4.nfo
> [4] doc4.pdf
> 
> this way (I think) numDocs() will return 4, but maxDoc() would return
> 5. Using numDocs() as the loop bound would make me lose a document, at
> least in the way I implemented it. Any tips?
> 
> gui
> 
> 
> On Thu, 2003-05-22 at 10:32, Brisbart Franck wrote:
> 
>>You don't really need to take care of the deleted docs. When you try 
>>to get a deleted doc (reader.document(i) on a deleted doc), an 
>>IllegalArgumentException will be thrown with the message 'attempt to 
>>access a deleted document'. Just catch this exception.
>>
>>Also, I suggest you use 'numDocs()' instead of 'maxDoc()' to get the 
>>real number of documents in the index.
>>
>>Franck
>>
>>Guilherme Barile wrote:
>>
>>>As I said, I'm still getting started (didn't implement deleting
>>>documents yet). Any tips on checking this ?
>>>
>>>On Thu, 2003-05-22 at 03:31, Morus Walter wrote:
>>>
>>>
>>>>Guilherme Barile writes:
>>>>
>>>>
>>>>>If you're trying to get all documents, why not
>>>>>
>>>>>IndexReader reader = IndexReader.open(this.indexDir);
>>>>>Document doc;
>>>>>	
>>>>>for (int i = 0; i < reader.maxDoc(); i++) {
>>>>>	try {
>>>>>		doc = reader.document(i);
>>>>>		System.out.println(i + " " + doc.get("source"));
>>>>>	}
>>>>>	catch (Exception e) {
>>>>>		System.out.println("Error getting doc " + i);
>>>>>	}
>>>>>}
>>>>>
>>>>
>>>>I guess there should be some extra check to take care of deleted
>>>>documents that haven't been removed from the index yet.
>>>>
>>>>greetings
>>>>	Morus


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

