lucene-java-user mailing list archives

From "ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)" <nageshbl...@gmail.com>
Subject Re: Deleted documents in the index.
Date Fri, 25 Jul 2008 14:09:13 GMT
Hi Michael,
The numDocs did come from IndexReader.numDocs().

Hmm, let me try with maxDoc.

Nagesh
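
For readers of the archive, the difference between the two loop bounds can be illustrated with a small model. This is only a sketch under stated assumptions: ToyIndex is a hypothetical stand-in written for this note, not the Lucene API. Its method names mirror IndexReader.numDocs(), maxDoc() and isDeleted(int) so the loop reads the same way, and it assumes that an update marks the old document number as deleted and appends the replacement under a new, higher number. The scenario (20 documents, 10 of them updated) is the one described further down the thread; which 10 get updated is an arbitrary choice here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/**
 * Toy stand-in for an index reader (NOT the Lucene API). It only models
 * how document numbers behave when documents are deleted and re-added.
 */
final class ToyIndex {
    private final List<String> docs = new ArrayList<>(); // slot i holds document number i
    private final BitSet deleted = new BitSet();          // document numbers marked deleted

    /** Appends a document and returns its new document number. */
    int add(String content) {
        docs.add(content);
        return docs.size() - 1;
    }

    /** Models an update: mark the old slot deleted, append the replacement. */
    int update(int docId, String content) {
        deleted.set(docId);
        return add(content);
    }

    /** Number of live (non-deleted) documents. */
    int numDocs() { return docs.size() - deleted.cardinality(); }

    /** One greater than the largest document number in use. */
    int maxDoc() { return docs.size(); }

    boolean isDeleted(int docId) { return deleted.get(docId); }

    String document(int docId) {
        if (isDeleted(docId)) {
            throw new IllegalArgumentException("attempt to access a deleted document");
        }
        return docs.get(docId);
    }

    /** Collects every live document, scanning document numbers 0 .. limit-1. */
    static List<String> liveDocs(ToyIndex index, int limit) {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < limit; i++) { // document numbers start at 0
            if (!index.isDeleted(i)) {
                live.add(index.document(i));
            }
        }
        return live;
    }

    /** Builds the scenario from the thread: 20 documents, the first 10 updated. */
    static ToyIndex sample() {
        ToyIndex index = new ToyIndex();
        for (int i = 0; i < 20; i++) index.add("doc-" + i);
        for (int i = 0; i < 10; i++) index.update(i, "doc-" + i + "-v2");
        return index;
    }
}
```

In this model, ToyIndex.sample() reports numDocs() == 20 and maxDoc() == 30. Calling liveDocs(index, index.numDocs()) stops at document number 19 and returns only the 10 untouched documents, while liveDocs(index, index.maxDoc()) returns all 20 live documents, including the 10 re-added ones. Note that the scan starts at 0, not 1. In a real index, segment merges may reclaim deleted slots, so the actual numbers can differ.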

On 7/25/08, Michael McCandless <lucene@mikemccandless.com> wrote:
>
> Oh, I think I see the problem -- instead of numDocs in your for loop
> (which I assume came from IndexReader.numDocs()) change that to maxDoc
> (IndexReader.maxDoc()).
>
> Mike
>
> (Nagesh S) wrote:
>
>> Hi Michael,
>> Thanks for your response. Yes, I got that.
>>
>> I guess my question is: how do I access the newly added document? In
>> other words, if the index initially had 20 docs, of which 10 were
>> updated (that is, deleted and then re-added), how do I access the
>> updated ones?
>>
>> Initially there was no check for deleted documents; that is, I did not
>> call IndexReader.isDeleted(int). The code had only the for loop, which
>> failed when fetching a deleted document, with the following:
>>
>> java.lang.IllegalArgumentException: attempt to access a deleted document
>>     at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:331)
>>     at org.apache.lucene.index.MultiReader.document(MultiReader.java:108)
>>     at org.apache.lucene.index.IndexReader.document(IndexReader.java:437)
>> etc.
>>
>> Regards,
>> Nagesh
>>
>> On 7/25/08, Michael McCandless <lucene@mikemccandless.com> wrote:
>>>
>>> When you call updateDocument, the old document is deleted but a
>>> wholly new document is added. So the "else" clause in your loop
>>> below will report on the newly added documents (you won't miss any).
>>>
>>> Mike
>>>
>>> (Nagesh S) wrote:
>>>
>>>> Hi,
>>>> I think the earlier mail didn't make it through.
>>>>
>>>> I am writing a class to report on an index. This index has documents
>>>> updated using the IndexWriter.updateDocument(Term, Document) method.
>>>> That is, documents were deleted and added again. My aim is to see
>>>> which documents (and their fields) are present in the index. Since an
>>>> updated document was deleted and re-added, it is marked as deleted,
>>>> and hence I am not able to obtain a Document object for it.
>>>>
>>>> How do I report on such documents?
>>>>
>>>> for (int i = 1; i < numDocs; i++) {
>>>>     // ir is an IndexReader object
>>>>     if (ir.isDeleted(i)) {
>>>>         bw.write("Document " + i + " has been deleted.");
>>>>         bw.newLine();
>>>>     } else {
>>>>         Document d = getDocument(ir, i);
>>>>
>>>>         List<Field> l = d.getFields();
>>>>         int numFields = l.size();
>>>>         bw.write("Document has " + numFields + " fields as follows");
>>>>         bw.newLine();
>>>>
>>>>         for (int j = 0; j < numFields; j++) {
>>>>             String fieldName = l.get(j).name();
>>>>             bw.write("\t Field : " + fieldName + " Value : "
>>>>                     + d.getField(fieldName).stringValue());
>>>>             bw.newLine();
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>