lucene-java-user mailing list archives

From Baskakov Daniel <gda...@gmail.com>
Subject Re: Synchronous Lucene index update tests occasionally fail
Date Mon, 27 Jun 2016 09:07:11 GMT
I've just noticed that not only the tests for dynamically adding/removing
entities fail, but also a simple indexing test.

Here is the boiled-down structure of the test:

  @BeforeClass
  public static void beforeClass() throws Exception
  {
    // ContextManager is a domain model
    contextManager = createContextManager();

    searcher = new ServerSearcher(Collections.singletonList(indexingSettings), false);

    searcher.openRamDirectory();

    // Context is a domain model item, it has variables. One of the
    // contexts has "variableWithHelpString" variable.
    // searcher.createContextDocument(aContext) creates a Lucene document
    // with a field: new TextField(field, value, Field.Store.YES)
    for (Context aContext : contextManager)
    {
      Document doc = searcher.createContextDocument(aContext);

      final Term idTerm = new Term(ID_FIELD, doc.get(ID_FIELD));
      // using updateDocument here because the domain model is dynamic
      searcher.getIndexWriter().updateDocument(idTerm, doc);
    }
  }

  @Test
  public void testVariableName() throws Exception
  {
    searcher.commitNow();

    String text = "variableWithHelpString";

    final MultiFieldQueryParser queryParser =
        new MultiFieldQueryParser(ServerSearcher.ALL_SEARCH_FIELDS, searcher.analyzer);
    queryParser.setDefaultOperator(QueryParser.Operator.AND);

    text = text.replaceAll("(\\w+)", "$1\\*");

    final Query query = queryParser.parse(text);

    final IndexSearcher indexSearcher = searcher.acquireIndexSearcher();

    TopScoreDocCollector docCollector = TopScoreDocCollector.create(5000);

    indexSearcher.search(query, docCollector);

    final ScoreDoc[] scoreDocs = docCollector.topDocs().scoreDocs;

    assertThat(scoreDocs.length, is(1));
  }
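
A stdlib-only sketch of the wildcard rewrite used in testVariableName: the
replaceAll call appends a literal * to every run of word characters, so each
word becomes a prefix term. The class and method names are hypothetical; only
the regex and the replacement string come from the test.

```java
/** Illustration only: isolates the wildcard rewrite from the test. */
final class WildcardRewrite {
    /**
     * Appends a literal '*' after every run of word characters.
     * In the replacement string, $1 is the matched word and \* is an escaped '*'.
     */
    static String appendWildcards(String text) {
        return text.replaceAll("(\\w+)", "$1\\*");
    }
}
```

For the test input this yields variableWithHelpString*, and multi-word input
such as "help string" gets a wildcard after each word.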

There is also comprehensive logging during the test run. It shows that the
document for the context with 'variableWithHelpString' is properly created
and added to the IW:
12:34:23,499 DEBUG ag.context.search         Variable Definition document added to index:
Document<stored,indexed,indexOptions=DOCS<id:rootcontext:variableWithHelpString>
stored,indexed,tokenized,omitNorms,indexOptions=DOCS,numericType=INT,numericPrecisionStep=8<docType:1>
stored,indexed,indexOptions=DOCS<contextPath:rootcontext>
stored,indexed,tokenized<name:variableWithHelpString>
stored,indexed,tokenized<description:variableWithHelpString>
stored,indexed,tokenized<value:Two Beer Or Not Two Beer?, , >
stored,indexed,tokenized<fields:Dummy Field (dummyField)>
stored,indexed,tokenized<fields:Variable Field (variableField) [Variable Field Help]>
stored,indexed,tokenized<fields:Dummy Field (dummyField1)>>
  -     -     -     -     -     -     -     -     -     -     [main]

Here is the later log output for the search operation, which returns no documents:
12:34:23,796 DEBUG ag.context.search         Search
'name:variablewithhelpstring* type:variablewithhelpstring*
description:variablewithhelpstring* help:variablewithhelpstring*
fields:variablewithhelpstring* outputFields:variablewithhelpstring*
value:variablewithhelpstring*' took '0' seconds and returned 0 hits     -
  -     -     -     -     -     -     -     -     -     [main]
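
The logged query can be related back to the test input with a small
stdlib-only sketch. It is not Lucene code and does not reproduce
MultiFieldQueryParser; it only assumes, from the log line above, that the term
is lowercased and that one prefix clause is emitted per field, in the order the
fields appear in the log. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Illustration only: rebuilds the logged query string from a field list. */
final class LoggedQuerySketch {
    // Field names copied from the search log; assumed to mirror ALL_SEARCH_FIELDS.
    static final List<String> FIELDS = Arrays.asList(
            "name", "type", "description", "help", "fields", "outputFields", "value");

    /** Lowercases the wildcard term and emits one "field:term" clause per field. */
    static String expand(String wildcardTerm) {
        String lowered = wildcardTerm.toLowerCase(Locale.ROOT);
        return FIELDS.stream()
                .map(field -> field + ":" + lowered)
                .collect(Collectors.joining(" "));
    }
}
```

Comparing this with the indexed document in the earlier log: that document has
no type, help, or outputFields field, so only the name, description, fields,
and value clauses could match it.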

Daniel.

On Mon, Jun 27, 2016 at 11:12, Michael McCandless <lucene@mikemccandless.com> wrote:

> Can you boil this down to a small standalone test case showing the issue?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Mon, Jun 27, 2016 at 4:03 AM, Baskakov Daniel <gdaniq@gmail.com> wrote:
>
>> Thank you Mike.
>>
>> Commit is performed after each indexing op in unit tests only:
>>
>>   public void commitNow() throws IOException
>>   {
>>     if (getIndexWriter().hasUncommittedChanges())
>>     {
>>       getIndexWriter().commit();
>>     }
>>   }
>>
>> In the production environment I have a timer that performs a commit
>> periodically if required.
>>
>> I do reopen near-real-time IR before every test search (thanks to your
>> blog!):
>>
>>   private IndexSearcher acquireIndexSearcher() throws IOException
>>   {
>>     if (searcherManager == null)
>>     {
>>       searcherManager = new SearcherManager(getIndexWriter(), true, null);
>>     }
>>     searcherManager.maybeRefreshBlocking();
>>     return searcherManager.acquire();
>>   }
>>
>> But the problem is still there.
>>
>> Daniel.
>>
>> On Thu, Jun 23, 2016 at 17:19, Michael McCandless <lucene@mikemccandless.com> wrote:
>>
>> > You must reopen your IndexReader to see recent changes to the index.
>> >
>> > But, IW.commit after each indexing op is very costly.
>> >
>> > It's much better to get near-real-time readers, e.g. from a
>> > SearcherManager that you pass your IW instance to, after each set of
>> > changes that you now need to search.
>> >
>> > As long as you call SearcherManager.maybeRefreshBlocking after changes to
>> > the IW, the resulting reopened reader will reflect your index changes.
>> >
>> > Mike McCandless
>> >
>> > http://blog.mikemccandless.com
>> >
>> > On Thu, Jun 23, 2016 at 7:47 AM, Baskakov Daniel <gdaniq@gmail.com> wrote:
>> >
>> >> I originally posted this question at stackoverflow.com but got no
>> >> reply, so I hope someone can help me on the official list.
>> >>
>> >> I'm testing that dynamic changes to the domain model are reflected in the
>> >> Lucene index. Special event listeners (synchronous, no multithreading here)
>> >> are executed when the domain model components change. Listeners update the
>> >> Lucene index:
>> >>
>> >> Document doc = createDocumentForComponent(domainModelComponent);
>> >> indexWriter.updateDocument(docTerm, doc);
>> >> indexWriter.commit();
>> >>
>> >> Then I search with a query that targets the recently added changes.
>> >> Most of the time the tests work perfectly, but sometimes they fail
>> >> (especially in automated builds).
>> >>
>> >> I've tried acquiring an IndexSearcher in different ways: creating a new
>> >> searcher on the same Directory or obtaining it via SearcherManager.
>> >>
>> >> Is there a way to make recent index changes available to the index searcher
>> >> with 100% confidence?
>> >>
>> >
>> >
>>
>
>
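
The advice quoted above, that a reader must be reopened to see recent changes,
can be pictured with a toy point-in-time model. This is plain Java, not Lucene:
ToyIndex and ToyReader are hypothetical names, and copying the whole list on
open is only a stand-in for a reader that keeps the view it was opened on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy writer side: accumulates documents and hands out snapshot readers. */
final class ToyIndex {
    private final List<String> docs = new ArrayList<>();

    void add(String doc) {
        docs.add(doc);
    }

    /** Opens a reader over an immutable copy of the current contents. */
    ToyReader openReader() {
        return new ToyReader(Collections.unmodifiableList(new ArrayList<>(docs)));
    }
}

/** Toy reader side: sees only what existed when it was opened. */
final class ToyReader {
    private final List<String> snapshot;

    ToyReader(List<String> snapshot) {
        this.snapshot = snapshot;
    }

    boolean contains(String doc) {
        return snapshot.contains(doc);
    }
}
```

In this model a reader opened before add("b") never reports "b", while a reader
opened afterwards does, which is the behaviour a refresh before each search is
meant to address.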
