lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Recreating index lucene without stopping client applications
Date Wed, 18 Jul 2018 09:50:18 GMT
If you use IndexWriter.deleteAll, rather than the delete-by-Query or
delete-by-Term methods, the delete should be quite efficient, as IndexWriter
just drops all segments.

That API is also transactional, so you could call IW.deleteAll, proceed to
reindex all your documents, and if somehow that crashes before finishing,
your index will still reflect the old index with nothing deleted or
updated.  Only once you successfully commit will the new index become
visible to maybeRefresh() calls on a non-NRT reader.
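
A minimal sketch of that sequence, assuming Lucene's core and analyzer jars
are on the classpath. The class name ReindexInPlace, the method rebuild, and
its parameters are hypothetical names chosen for illustration, and
StandardAnalyzer is only a stand-in for whatever analyzer the index really
uses:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical helper: rebuilds an existing index in place in one transaction.
public final class ReindexInPlace {

    private ReindexInPlace() {}

    public static void rebuild(Path indexPath, List<Document> newDocs) throws IOException {
        try (Directory dir = FSDirectory.open(indexPath)) {
            // The default open mode (CREATE_OR_APPEND) opens the existing index.
            IndexWriter writer =
                new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
            try {
                writer.deleteAll();              // drops all segments, pending until commit
                for (Document doc : newDocs) {
                    writer.addDocument(doc);     // not visible to non-NRT readers yet
                }
                writer.commit();                 // the new index becomes visible here
            } catch (IOException | RuntimeException e) {
                writer.rollback();               // discard uncommitted changes, keep the old commit
                throw e;
            } finally {
                writer.close();                  // no-op if rollback() already closed the writer
            }
        }
    }
}
```

On the application side nothing should need to change: a SearcherManager
opened on the same directory would be expected to pick up the new commit on
its next maybeRefresh() call.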

Mike McCandless

http://blog.mikemccandless.com

On Tue, Jul 17, 2018 at 5:15 PM, Michael Sokolov <msokolov@gmail.com> wrote:

> If you create a completely new index, rather than applying updates to an
> existing index, you will not be able to see that by calling maybeRefresh(),
> I think, since that is looking for updates to an existing index.
> Conceivably you could open a writer on the existing index, delete all of
> its documents, and then write new ones and commit. After that, your refresh
> call would see the updates. But I wouldn't recommend this since it might be
> inefficient to do all those deletions. Instead I would suggest creating a
> new index directory, and having some process that watches for a new
> directory being created. Then when it sees that, it could open a new
> searcher using that directory, and replace your existing searcher. In other
> words, implement the refresh yourself, since you have taken over the
> process of writing new indexes outside of what Lucene manages. Another
> possibility would be to maintain a timestamp on your documents, write all
> your new documents, and then query-and-delete any documents with old
> timestamps.  But the key point here is that you can't just create a new
> index and expect your reader to know about it just because you stuck it in
> the same file system directory where the old one was.
>
> On Wed, Jul 11, 2018 at 11:46 AM Eduardo Costa Lopes <
> eduardo-costa.lopes@serpro.gov.br> wrote:
>
> > Hi Marco,
> >
> > Basically, the content of the Lucene index directory is deleted, and
> > afterwards the index is recreated (under the same directory). Months ago I
> > researched how to "refresh" the Lucene access to get the newest data
> > without restarting the web applications. Version 6.1.0 provides the class
> > SearcherManager, whose maybeRefresh() method, according to the
> > documentation, should be called periodically to reopen the index. Our
> > "reopen scheduler" runs hourly, and even though it executes successfully,
> > the data it sees does not seem to be the newest.
> >
> > Thanks.
> >
> >
> > ==================================
> > Eduardo Costa Lopes
> > SERPRO - SUPDE/DEPAE/DE009
> >
> > e-mail: eduardo-costa.lopes@serpro.gov.br
> > phone: (51) 2129 - 1180
> >
> > ----- Original Message -----
> > From: "Marco Reis" <marco.antonio.sousa.reis@gmail.com>
> > To: "java-user" <java-user@lucene.apache.org>
> > Sent: Wednesday, July 11, 2018 12:06:18
> > Subject: Re: Recreating index lucene without stopping client applications
> >
> > Hi Eduardo,
> >
> > The index recreation process is not clear to me, but I think you have two
> > different SearcherManagers, one for the app and a different one for the
> > command line. At some point, one of them could see the document deletion
> > while the one in JBoss doesn't. Maybe reopening the index directory could
> > help.
> >
> >
> >
> >
> > On Wed, Jul 11, 2018 at 11:46 AM Eduardo Costa Lopes <
> > eduardo-costa.lopes@serpro.gov.br> wrote:
> >
> > > Hello,
> > >
> > > I have a JBoss application querying a Lucene index to get some customer
> > > info. Sometimes the index is recreated while the application is running.
> > > Basically, the old index is erased and a new one is created. On the
> > > application side we have a scheduler calling
> > > org.apache.lucene.search.SearcherManager.maybeRefresh() in order to get a
> > > new connection to the index. The issue is: today we updated the index,
> > > and when searching for a certain name our command-line tool returns 4955
> > > hits, but in the web app we got 4058 hits (three more). The correct hit
> > > count is only shown if we restart JBoss. I'd like to know how we can
> > > recreate the Lucene index without needing to restart the applications.
> > >
> > > Thanks in advance,
> > >
> > > Eduardo Lopes.
> > >
> > >
> > >
> > >
> > > -
> > >
> > >
> > > "This message from SERVIÇO FEDERAL DE PROCESSAMENTO DE DADOS (SERPRO) --
> > > a government company established under Brazilian law (5.615/70) -- is
> > > directed exclusively to its addressee and may contain confidential data,
> > > protected under professional secrecy rules. Its unauthorized use is
> > > illegal and may subject the transgressor to the law's penalties. If you're
> > > not the addressee, please send it back, elucidating the failure."
> > >
> > --
> > Marco Reis
> > Software Engineer
> > http://marcoreis.net
> > https://github.com/masreis
> > +55 61 9 81194620
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
