lucene-java-user mailing list archives

From Badano Andrea <Andrea.Bad...@sweco.se>
Subject Re: ControlledRealTimeReopenThread
Date Mon, 01 Dec 2014 23:22:42 GMT
Thanks for your reply!

I am trying to delete documents using a Term that matches a TextField of the document:

  private static final String NAME = "name";

  private void store(String n, ... other fields ...) {
    Document d = new Document();
    d.add(new TextField(NAME, n, Field.Store.YES));
    ... add other fields ...
    _iw.addDocument(d);
  }

  private void remove(String n) {
    Term t = new Term(NAME, n);
    _iw.deleteDocuments(t);
  }

Is it possible to remove a document in this manner, that is, by creating a Term object from
a document field of type TextField?
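One likely reason a delete like this matches nothing: a TextField value is passed through the analyzer at index time, so the index holds the individual tokens, while deleteDocuments(Term) looks for the term text exactly as given. A StringField, by contrast, indexes the whole value verbatim as a single term. The sketch below is a toy model in plain Java, not Lucene code; the class AnalyzedDeleteSketch, its methods, and the lowercase-and-split "analyzer" are hypothetical stand-ins chosen only to show the mismatch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model (assumption: NOT Lucene's implementation) of one indexed field.
class AnalyzedDeleteSketch {
    // docId -> set of terms stored in the index for that document
    private final Map<Integer, Set<String>> docs = new HashMap<>();
    private int nextId = 0;

    // Hypothetical analyzer: lowercases and splits on non-alphanumeric characters.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Models an analyzed field: only the tokens are indexed.
    int addAnalyzed(String value) {
        docs.put(nextId, new HashSet<>(analyze(value)));
        return nextId++;
    }

    // Models a verbatim field: the whole value is indexed as one term.
    int addVerbatim(String value) {
        docs.put(nextId, new HashSet<>(Collections.singleton(value)));
        return nextId++;
    }

    // Removes every document containing exactly this term; returns how many were removed.
    int deleteByExactTerm(String term) {
        int before = docs.size();
        docs.values().removeIf(terms -> terms.contains(term));
        return before - docs.size();
    }

    public static void main(String[] args) {
        AnalyzedDeleteSketch analyzed = new AnalyzedDeleteSketch();
        analyzed.addAnalyzed("Main Street 12");
        // No indexed token equals the whole original string, so nothing is removed.
        System.out.println(analyzed.deleteByExactTerm("Main Street 12"));

        AnalyzedDeleteSketch verbatim = new AnalyzedDeleteSketch();
        verbatim.addVerbatim("Main Street 12");
        // The whole value is the indexed term, so the document is removed.
        System.out.println(verbatim.deleteByExactTerm("Main Street 12"));
    }
}
```

If this is what is happening in the wrapper, a delete on a single-word lowercase name could still appear to work, while any name containing uppercase letters, spaces, or punctuation would not match. Keeping a separate, unanalyzed identifier field for deletes avoids the mismatch.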

I never close() any of the documents created in my wrapper.
All add/update/deletes go via the TrackingIndexWriter, while all commits are called on the
underlying IndexWriter.

Regards,

Andrea

On 1 Dec 2014, at 23:23, Michael Sokolov <msokolov@safaribooksonline.com> wrote:

It's impossible to tell since you didn't include the code for it, but my advice would be to
look at how the documents are being marked for deletion.  What are the terms being used to
delete them?  Are you trying to use lucene docids?

-Mike

On 12/1/2014 4:22 PM, Badano Andrea wrote:
> Hello,
> 
> My apologies for a longish question.
> 
> I am having some problems with a class that tries to ensure that a lucene index is
> always kept up-to-date with the contents of a mysql master database. Users add,
> modify, and delete items in the master database, and all changes to the master
> database are immediately propagated to the index. When the application starts up,
> all items present in the master database that are not present in the index are
> added to the index. Similarly, all items present in the index that are not present
> in the master database are removed from the index.
> 
> I am trying to do this with code based on http://stackoverflow.com/questions/17993960/lucene-4-4-0-new-controlledrealtimereopenthread-sample-usage.
> Automatically copying data from the master database to the index seems to work.
> However, removing items from the index not present in the database does not seem to work.
> 
> So I have this class:
> 
> class IndexWrapper {
>   private final IndexWriter _iw;
>   private final TrackingIndexWriter _triw;
>   private final ReferenceManager<IndexSearcher> _rmgr;
>   private final ControlledRealTimeReopenThread<IndexSearcher> _reopen;
>   private final Analyzer _analyzer;
>   private AtomicLong _gen;
>   ...
> }
> 
> that is set up as follows:
> 
> _iw = new IndexWriter(directory, new IndexWriterConfig(Version.LUCENE_4_10_2, analyzer));
> _triw = new TrackingIndexWriter(_iw);
> _rmgr = new SearcherManager(_iw, true, null);
> _reopen = new ControlledRealTimeReopenThread<IndexSearcher>(_triw, _rmgr, 60.00, 0.1);
> _analyzer = analyzer;
> _gen = new AtomicLong(_triw.getGeneration());
> _reopen.start();
> 
> First some code that fetches every doc in the index is called:
> 
> _reopen.waitForGeneration(_gen.get()); // wait until the index is re-opened for the last update
> IndexSearcher searcher = _rmgr.acquire();
> try {
>   ... fetch all documents in index ...
> }
> finally {
>   _rmgr.release(searcher);
> }
> 
> This returns all docs in the index. Later on, there is an attempt to remove some of these
> documents (the ones that no longer exist in the master database):
> 
> long curr = _gen.get();
> _gen.compareAndSet(curr, _triw.deleteDocuments(termToRemove));
> _iw.commit();
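A side note on the generation bookkeeping in the snippet above: compareAndSet(curr, newGen) leaves the stored value unchanged whenever another thread has modified it between the get() and the compare-and-set, so the new generation can be dropped silently. Whether that matters depends on whether the wrapper is called from several threads, which the message does not say. A minimal sketch of an alternative, assuming generations only ever increase, is to keep the largest value seen; GenerationTracker is a hypothetical helper, not a Lucene class.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: remembers the highest generation returned by any write.
class GenerationTracker {
    private final AtomicLong latest = new AtomicLong();

    // Atomically stores max(current, generation) and returns the stored value.
    long record(long generation) {
        return latest.accumulateAndGet(generation, Math::max);
    }

    long get() {
        return latest.get();
    }

    public static void main(String[] args) {
        GenerationTracker tracker = new GenerationTracker();
        tracker.record(5);
        tracker.record(3); // a late, smaller generation does not move the value backwards
        System.out.println(tracker.get());
    }
}
```

With such a helper, the value passed to waitForGeneration() would be tracker.get(), and each add, update, or delete would pass its returned generation to record().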
> 
> This code runs without any exceptions being thrown, but it does not seem to remove anything.
> If I enable logging, I see things such as:
> 
> DW : anyChanges? numDocsInRam=0 deletes=false hasTickets:false pendingChangesInFullFlush: false
> 
> Supposedly the printout
> 
> numDocsInRam=0
> 
> means that commit() has not found any documents to delete. Also, if I add some extra
> logging to IndexWriter.deleteDocuments() like so:
> 
> public void deleteDocuments(Term... terms) throws IOException {
>   ensureOpen();
>   try {
>     boolean dt = docWriter.deleteTerms(terms);
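>     // Arrays.toString prints every term; passing the array directly would format only the first
>     System.err.printf("DELETING TERMS : %s\n", java.util.Arrays.toString(terms));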
>     System.err.printf("DT : %s\n", dt);
>     if (dt) {
>       processEvents(true, false);
>     }
>   } catch (OutOfMemoryError oom) {
>     tragicEvent(oom, "deleteDocuments(Term..)");
>   }
> }
> 
> I can see printouts :
> 
> DT : false
> 
> So, an IndexWriter is given to a ReferenceManager, which is then used to create an IndexSearcher
> that returns a set of documents. Yet later, when an attempt is made to remove some of these
> documents, the IndexWriter (or rather, its docWriter) cannot find them. Assuming
> that the IndexWriter is somehow involved in the initial fetch of all documents, I am confused
> about how the IndexWriter, a short while later, cannot find some of the documents that have
> been marked (by my application) for deletion. I am pretty sure that the Term objects that are
> passed into deleteDocuments() are compatible with the documents previously returned by the
> IndexSearcher. So have I misunderstood the role of the IndexWriter as some kind of central
> gateway to all documents?
> 
> Andrea
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org




