lucene-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 30330] - IndexReader.delete(term) does not delete last doc from TermDocs list
Date Fri, 30 Jul 2004 20:59:29 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=30330>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=30330

------- Additional Comments From alan@collison.net  2004-07-30 20:59 -------
It is quite possible that the way I am adding and then deleting a document
from the index is the source of the problem, although what I am doing is only
a small variation on the demo code.  I'm not a Java programmer and have only
recently started playing with Lucene.

I've attached two class files which are small modifications of the
IndexFiles.java and DeleteFiles.java files included in the Lucene distro.
I've also included the jar file that contains them (although I received an
error when uploading and am not positive it was uploaded safely).  I'm afraid
I don't have a publicly accessible server where you can see directly what I
have, but here's a complete rundown of what I did:

- created an archive of plain text/html documents (in this case in the
  directory /home/acolliso/archive)

- ran java org.apache.lucene.demo.IndexFiles /home/acolliso/archive
   This created an index of about 190 MB and searching on the index appears to work fine.

- then I ran the following:

> java org.apache.lucene.demo.IndexFile /home/acolliso/archive/newdoc1.html
adding /home/acolliso/archive/newdoc1.html
244 total milliseconds

  newdoc1.html is a small file with a few unique words in it, so I know it
  will show up at the top of my search later on.  Note that IndexFile does
  *not* run optimize.  I don't know if that has any consequences.

> java org.apache.lucene.demo.SearchFiles
Query:  +white +dwarfs +black +holes
Searching for: +white +dwarfs +black +holes
2 total matching documents
0. /home/acolliso/archive/newdoc.html
1. /home/acolliso/archive/newdoc1.html

> java org.apache.lucene.demo.DeleteFile /home/acolliso/archive/newdoc1.html
Path to delete: /home/acolliso/archive/newdoc1.html
Term: path:/home/acolliso/archive/newdoc1.html
Got next doc ...false
Term docs: 164437
deleted 0 documents containing path:/home/acolliso/archive/newdoc1.html

The 'path' *seems* to be o.k., but next() returns false and the doc is not deleted,
as the following search confirms:
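
For readers unfamiliar with the pattern, the delete-by-term loop exercised
above can be sketched roughly as follows.  This is a toy in-memory model, not
Lucene's code: PostingsCursor, ToyIndex and their methods are hypothetical
names invented for illustration, and the sketch only assumes that deletion
walks a cursor over the term's documents with next()/doc().  It shows why the
reported count is 0 whenever the first next() call returns false, which is
the symptom in the output above.  It does not attempt to reproduce the cause
of the bug itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DeleteByTermSketch {

    /** Hypothetical cursor over the doc ids for one term (not Lucene's TermDocs). */
    static final class PostingsCursor {
        private final List<Integer> docIds;
        private int pos = -1;

        PostingsCursor(List<Integer> docIds) {
            this.docIds = docIds;
        }

        /** Advances to the next doc; returns false once the list is exhausted. */
        boolean next() {
            pos++;
            return pos < docIds.size();
        }

        /** Doc id at the current position; only valid after next() returned true. */
        int doc() {
            return docIds.get(pos);
        }
    }

    /** Hypothetical toy index: term -> doc ids, plus a set of deleted doc ids. */
    static final class ToyIndex {
        private final Map<String, List<Integer>> postings = new HashMap<>();
        private final Set<Integer> deleted = new HashSet<>();

        void add(String term, int docId) {
            postings.computeIfAbsent(term, k -> new ArrayList<>()).add(docId);
        }

        PostingsCursor cursor(String term) {
            return new PostingsCursor(postings.getOrDefault(term, new ArrayList<>()));
        }

        /** Marks every doc listed for the term as deleted; returns how many were visited. */
        int delete(String term) {
            PostingsCursor docs = cursor(term);
            int n = 0;
            while (docs.next()) {   // if the first next() is false, nothing is deleted
                deleted.add(docs.doc());
                n++;
            }
            return n;
        }

        boolean isDeleted(int docId) {
            return deleted.contains(docId);
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("path:/archive/a.html", 0);
        index.add("path:/archive/b.html", 1);

        System.out.println(index.delete("path:/archive/b.html"));       // prints 1
        System.out.println(index.delete("path:/archive/missing.html")); // prints 0
    }
}
```

In this model a count of 0 means the cursor produced no documents at all for
the term, so the thing to check is why the cursor comes back empty even though
the 'path' term looks right.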

> java org.apache.lucene.demo.SearchFiles
Query: +white +dwarfs +black +holes
Searching for: +white +dwarfs +black +holes
2 total matching documents
0. /home/acolliso/archive/newdoc.html
1. /home/acolliso/archive/newdoc1.html

Then I modified the delete() code as noted in the original report
and tried delete() again:

> java org.apache.lucene.demo.DeleteFile /home/acolliso/archive/newdoc1.html
Path to delete: /home/acolliso/archive/newdoc1.html
Term: path:/home/acolliso/archive/newdoc1.html
Got next doc ...false
Term docs: 164437
deleted 1 documents containing path:/home/acolliso/archive/newdoc1.html

And to see if the document really was deleted, I did the search again:

> java org.apache.lucene.demo.SearchFiles
Query: +white +dwarfs +black +holes
Searching for: +white +dwarfs +black +holes
1 total matching documents
0. /home/acolliso/archive/newdoc.html

Which makes it seem that the doc was successfully deleted.

If there is any other info I can provide that would help, or could provide
in a different format, I'm happy to provide it.  Or, if this really ought to
go to the distro list instead, I'm happy to do that as well.  Thanks!

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

