lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: IndexWriter.deleteDocuments(Query[]) not deleting
Date Sun, 22 Aug 2010 20:47:01 GMT
Did you commit (or close) the IndexWriter after you deleted the documents?

Also, I'm assuming nothing odd happened, such as a case change. NOT_ANALYZED
should preserve the paths exactly at index time, but are you sure the case
matches when you build your term queries?

An interesting test would be to write out the file names you create your
terms from, and see what happens if you search on those fields.
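
A minimal sketch of that test, in plain Java with no Lucene calls. The class
TermValueCheck and its methods are hypothetical names made up for
illustration; only the two path expressions come from the original post. The
idea is to log the exact strings used at index time and at delete time, then
compare them, since a NOT_ANALYZED field only matches on exact string
equality.

```java
import java.io.File;
import java.util.List;

/** Hypothetical helper; not part of Lucene or of the original poster's code. */
class TermValueCheck {

    /** The string stored in FILE-PATH at index time, per the original post. */
    static String filePathValue(File file) {
        return file.getAbsolutePath();
    }

    /** The string stored in DIR-PATH at index time, per the original post. */
    static String dirPathValue(File file) {
        // Assumes the file has a parent; getParentFile() returns null otherwise.
        return file.getParentFile().getAbsolutePath();
    }

    /**
     * Compares a delete-time term value against the values logged at index
     * time. An exact match is reported first; a value that differs only in
     * case is reported separately, since it would not match a NOT_ANALYZED
     * field.
     */
    static String diagnose(String deleteValue, List<String> indexedValues) {
        for (String v : indexedValues) {
            if (v.equals(deleteValue)) {
                return "exact match: " + v;
            }
        }
        for (String v : indexedValues) {
            if (v.equalsIgnoreCase(deleteValue)) {
                return "case mismatch: " + v;
            }
        }
        return "no match";
    }
}
```

If diagnose reports anything other than an exact match for the value passed
to the delete, the TermQuery cannot match the indexed documents, whatever the
commit behavior is.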

HTH
Erick

On Sun, Aug 22, 2010 at 12:24 PM, Paul J. Lucas <paul@lucasmail.org> wrote:

> Hi -
>
> Using Lucene 2.9.3, I'm indexing the metadata in image files.  For each
> image ("document" in Lucene), I have 2 additional special fields:
> "FILE-PATH" (containing the full path of the file) and "DIR-PATH"
> (containing the full path of the directory the file is in).
>
> The FILE-PATH Field is created only once like:
>
>    private final Field m_fieldFilePath = new Field(
>        "FILE-PATH", "INIT", Field.Store.YES, Field.Index.NOT_ANALYZED
>    );
>
> and reused; the DIR-PATH Field is created once per document like:
>
>    new Field(
>        "DIR-PATH", file.getParentFile().getAbsolutePath(),
>        Field.Store.NO, Field.Index.NOT_ANALYZED
>    )
>
> (The reason the DIR-PATH Field is created once per document is because it's
> part of indexing the rest of the image metadata and isn't a special-case
> like FILE-PATH.  I don't believe this is relevant to the problem at hand,
> however.)
>
> If an image file (or an entire directory of image files) gets deleted, I
> need to delete it (them) from the index.  When deleting a single image, I
> could do:
>
>        Term fileTerm = new Term( "FILE-PATH", file.getAbsolutePath() );
>        writer.deleteDocuments( new TermQuery( fileTerm ) );
>
> When deleting an entire directory of images, I could do:
>
>        Term dirTerm = new Term( "DIR-PATH", file.getAbsolutePath() );
>        writer.deleteDocuments( new TermQuery( dirTerm ) );
>
> However, at the time of deletion, I don't know whether "file" refers to a
> single image file or to a directory of image files.  I can't do
> file.isFile() or file.isDirectory() because "file" no longer exists (it was
> deleted).  So to cover both cases, I do:
>
>        Query[] queries = new Query[]{
>            new TermQuery( fileTerm ),
>            new TermQuery( dirTerm )
>        };
>        writer.deleteDocuments( queries );
>
> I have non-Lucene code that monitors the filesystem for changes.  For Mac
> OS X, I can only get directory-level change notifications.  So if a file is
> deleted from a directory, I get a notification that the directory has
> changed.  So I delete all the documents in that directory then re-add them.
>
> However (and here's the problem), the deletes never happen.  If I delete a
> file from a directory, the directory looks like it's unindexed and
> reindexed, but a query for that image file still returns a result.  So it's
> like the delete never happened.
>
> Why not?
>
> Additional information: I create/close a new IndexWriter for the delete.
> Even if I quit the application, relaunch, and run the query, the result
> still shows up (hence it's not that the current reader isn't seeing the
> deletion change).
>
> - Paul