lucene-java-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: ControlledRealTimeReopenThread
Date Mon, 01 Dec 2014 23:44:05 GMT
Yes that all looks reasonable.  Maybe there is a mismatch in the 
analysis chain?  I'm just throwing out wild guesses because I don't 
really see any problems in what you shared.  Also - if the problem 
really has something to do with ControlledRealTimeReopenThread, I'm not 
going to have the answer, so I apologize but I think I need to bow out.
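
To illustrate the analysis-chain guess: a TextField is tokenized by the analyzer at index time, whereas the Term passed to deleteDocuments() is matched exactly as given and is not analyzed. The sketch below is a toy model in plain Java, not Lucene code. The ToyIndex class, its method names, and the split-and-lowercase "analyzer" are all hypothetical stand-ins, and the analyzer actually used in the quoted code is not shown, so its real behavior may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy inverted index (hypothetical, NOT Lucene) showing exact-term deletion. */
class ToyIndex {
    // term -> ids of the documents that were indexed with that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Set<Integer> live = new HashSet<>();
    private int nextId = 0;

    /** Assumed analyzer stand-in: split on non-alphanumerics, then lowercase. */
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{Alnum}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** Models a tokenized field: only the analyzed tokens become terms. */
    public int addAnalyzed(String value) {
        int id = nextId++;
        live.add(id);
        for (String token : analyze(value)) {
            postings.computeIfAbsent(token, k -> new HashSet<>()).add(id);
        }
        return id;
    }

    /** Models an un-analyzed key field: the whole value is a single term. */
    public int addVerbatim(String value) {
        int id = nextId++;
        live.add(id);
        postings.computeIfAbsent(value, k -> new HashSet<>()).add(id);
        return id;
    }

    /** Deletes by exact term (the term is not analyzed); returns docs removed. */
    public int deleteByTerm(String term) {
        Set<Integer> ids = postings.remove(term);
        if (ids == null) {
            return 0; // no such term in the index, so nothing is deleted
        }
        int removed = 0;
        for (Integer id : ids) {
            if (live.remove(id)) {
                removed++;
            }
        }
        return removed;
    }

    public int liveCount() {
        return live.size();
    }
}
```

In this model, a document added with addAnalyzed("Hello World") is not removed by deleteByTerm("Hello World"), because only the tokens "hello" and "world" exist as terms. The same value added with addVerbatim() is removed. If the real index behaves similarly, one option would be to store the delete key in a field that is indexed without analysis, such as a StringField, and build the delete Term from that field.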

-Mike


On 12/1/2014 6:22 PM, Badano Andrea wrote:
> Thanks for your reply!
>
> I try to delete documents using a term that matches a Document TextField:
>
>    private static final String NAME = "name";
>
>    private void store(String n, ... other fields ...) {
>      Document d = new Document();
>      d.add(new TextField(NAME, n, Field.Store.YES));
>      ... add other fields ...
>      _iw.addDocument(d);
>    }
>
>    private void remove(String n) {
>      Term t = new Term(NAME, n);
>      _iw.deleteDocuments(t);
>    }
>
> Is it possible to remove a document in this manner, by creating a Term object based on a
> document field of type TextField?
>
> I never close() any of the documents created in my wrapper.
> All add/update/deletes go via the TrackingIndexWriter, while all commits are called on
> the underlying IndexWriter.
>
> Regards,
>
> Andrea
>
>
>
>
>
>
> On 1 Dec 2014, at 23:23, Michael Sokolov <msokolov@safaribooksonline.com> wrote:
>
> It's impossible to tell since you didn't include the code for it, but my advice would
> be to look at how the documents are being marked for deletion.  What are the terms being used
> to delete them?  Are you trying to use lucene docids?
>
> -Mike
>
> On 12/1/2014 4:22 PM, Badano Andrea wrote:
>> Hello,
>>
>> My apologies for a longish question.
>>
>> I am having some problems with a class that tries to ensure that a lucene index is
>> always kept up-to-date with the contents of a mysql master database. Users add,
>> modify, and delete items in the master database, and all changes to the master
>> database are immediately propagated to the index. When the application starts up,
>> all items present in the master database that are not present in the index are
>> added to the index. Similarly, all items present in the index that are not present
>> in the master database are removed from the index.
>>
>> I am trying to do this with code based on http://stackoverflow.com/questions/17993960/lucene-4-4-0-new-controlledrealtimereopenthread-sample-usage.
>> Automatically copying data from the master database to the index seems to work.
>> However, removing items from the index not present in the database does not seem
>> to work.
>>
>> So I have this class:
>>
>> class IndexWrapper {
>>    private final IndexWriter _iw;
>>    private final TrackingIndexWriter _triw;
>>    private final ReferenceManager<IndexSearcher> _rmgr;
>>    private final ControlledRealTimeReopenThread<IndexSearcher> _reopen;
>>    private final Analyzer _analyzer;
>>    private AtomicLong _gen;
>>    ...
>> }
>>
>> that is set up as follows:
>>
>> _iw = new IndexWriter(directory, new IndexWriterConfig(Version.LUCENE_4_10_2, analyzer));
>> _triw = new TrackingIndexWriter(_iw);
>> _rmgr = new SearcherManager(_iw, true, null);
>> _reopen = new ControlledRealTimeReopenThread<IndexSearcher>(_triw, _rmgr, 60.00, 0.1);
>> _analyzer = analyzer;
>> _gen = new AtomicLong(_triw.getGeneration());
>> _reopen.start();
>>
>> First some code that fetches every doc in the index is called:
>>
>> // wait until the index is re-opened for the last update
>> _reopen.waitForGeneration(_gen.get());
>> IndexSearcher searcher = _rmgr.acquire();
>> try {
>>    ... fetch all documents in index ...
>> }
>> finally {
>>    _rmgr.release(searcher);
>> }
>>
>> This returns all docs in the index. Later on, there is an attempt to remove some
>> of these documents (the ones that no longer exist in the master database):
>>
>> long curr = _gen.get();
>> _gen.compareAndSet(curr, _triw.deleteDocuments(termToRemove));
>> _iw.commit();
>>
>> This code runs without any exceptions being thrown, but it does not seem to remove
>> anything.
>> If I enable logging, I see things such as:
>>
>> DW : anyChanges? numDocsInRam=0 deletes=false hasTickets:false pendingChangesInFullFlush: false
>>
>> Supposedly the printout
>>
>> numDocsInRam=0
>>
>> means that commit() has not found any documents to delete. Also, if I add some extra
>> logging to IndexWriter.deleteDocuments() like so:
>>
>> public void deleteDocuments(Term... terms) throws IOException {
>>    ensureOpen();
>>    try {
>>      boolean dt = docWriter.deleteTerms(terms);
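>>      // Arrays.toString avoids the Term[] being spread as printf varargs (only the first term would print)
>>      System.err.printf("DELETING TERMS : %s\n", java.util.Arrays.toString(terms));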
>>      System.err.printf("DT : %s\n", dt);
>>      if (dt) {
>>        processEvents(true, false);
>>      }
>>    } catch (OutOfMemoryError oom) {
>>      tragicEvent(oom, "deleteDocuments(Term..)");
>>    }
>> }
>>
>> I can see printouts :
>>
>> DT : false
>>
>> So, an IndexWriter is given to a ReferenceManager which is then used to create an
>> IndexSearcher that returns a set of documents. Yet later, when an attempt is made to
>> remove some of these documents, the IndexWriter (or rather, its docWriter) cannot find
>> these documents. Assuming that the IndexWriter is somehow involved in the initial fetch
>> of all documents, I am confused how the IndexWriter a short while later cannot find some
>> of these documents that have been marked (by my application) for deletion. I am pretty
>> sure that the Term objects that are passed into deleteDocuments() are compatible with
>> the documents previously returned by the IndexSearcher.
>> So have I misunderstood the role of the IndexWriter as some kind of central gateway
>> to all documents?
>>
>> Andrea
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

