lucene-java-user mailing list archives

From Jose Carlos Canova <jose.carlos.can...@gmail.com>
Subject Re: make data search as index progress.
Date Wed, 16 Apr 2014 02:55:23 GMT
No, the index remains; only documents added since the last commit are lost.
You can reopen it using OpenMode.APPEND (an enum in IndexWriterConfig). If
the process dies abnormally, for example on power loss, you may have to
delete the lock file (which locks the index against other indexing
processes) before reopening. I solved some performance issues by using a
multithreaded task, since the IndexWriter is thread safe. The problem with
the indexing (and probably with the samples, which in fact I never read ;-) )
is that the indexing task is single threaded, and committing with only a few
segments allowed on a large index will cause a performance decrease. One
alternative (that I haven't tested yet): as the index grows, you can
increase the number of segments allowed for your index.
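
A minimal sketch of the multithreaded approach described above, assuming
Java 8 or later. DocSink and ParallelIndexer are hypothetical names, not
Lucene types: DocSink only stands in for the two IndexWriter calls involved
(addDocument and commit) so that the pattern can be shown without the Lucene
jars. The sink must be thread safe, as IndexWriter is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for IndexWriter.addDocument(...) and
// IndexWriter.commit(); implementations must be thread safe.
interface DocSink {
    void add(String doc) throws Exception;
    void commit() throws Exception;
}

final class ParallelIndexer {
    // Adds every doc to one shared sink from a pool of `threads` workers,
    // commits after every `commitEvery` additions and once more at the end.
    // Returns the number of documents added.
    static int indexAll(List<String> docs, DocSink sink, int threads, int commitEvery)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger added = new AtomicInteger();
        List<Callable<Void>> tasks = new ArrayList<>();
        for (String doc : docs) {
            tasks.add(() -> {
                sink.add(doc);
                // Each counter value is seen by exactly one worker, so each
                // multiple of commitEvery triggers exactly one commit.
                if (added.incrementAndGet() % commitEvery == 0) {
                    sink.commit();
                }
                return null;
            });
        }
        try {
            for (Future<Void> result : pool.invokeAll(tasks)) {
                result.get(); // rethrows a worker failure, if any
            }
        } finally {
            pool.shutdown();
        }
        sink.commit(); // final commit covers the documents after the last batch
        return added.get();
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            docs.add("doc-" + i);
        }
        AtomicInteger commits = new AtomicInteger();
        DocSink counting = new DocSink() {
            public void add(String doc) { /* IndexWriter.addDocument would go here */ }
            public void commit() { commits.incrementAndGet(); }
        };
        int n = indexAll(docs, counting, 4, 10);
        System.out.println(n + " docs added, " + commits.get() + " commits");
    }
}
```

With a real IndexWriter behind the sink, a larger commitEvery means fewer
commits, at the price of more documents to re-index after a crash.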

Since I don't trust anybody, I use a database (Postgres) to keep a log of
the indexing; this keeps the task on track so that it can recover from where
it stopped. I haven't finished my pseudo project yet, and there are other
solid alternatives such as Hibernate Search, which is built on top of Lucene.
My problem is that I don't agree with third-party frameworks on top of the
Lucene component, because Lucene is being updated and enhanced faster than
the third-party companies that use it can follow. That said, Hibernate Search
and Neo4j are industry standards and both use Lucene.
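
A minimal sketch of the recovery log idea, with one swap stated plainly: it
keeps the progress in a plain file instead of a Postgres table, so that it
runs with the JDK alone. IndexCheckpoint and the file name progress.txt are
hypothetical, and the item numbering is an assumption about how the input is
ordered.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: remembers the position of the last input item whose
// index commit has completed, so a restarted job can resume after it.
final class IndexCheckpoint {
    private final Path file;

    IndexCheckpoint(Path file) {
        this.file = file;
    }

    // Returns the last recorded position, or -1 if nothing was recorded yet.
    long load() throws IOException {
        if (!Files.exists(file)) {
            return -1L;
        }
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? -1L : Long.parseLong(text);
    }

    // Call only after the index commit covering `position` has returned.
    void save(long position) throws IOException {
        // Write to a temporary file and rename it, so a crash in the middle
        // of the write cannot leave a truncated checkpoint behind.
        // ATOMIC_MOVE replaces an existing target on common platforms; the
        // JDK documents that behaviour as implementation specific.
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        Files.write(tmp, Long.toString(position).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index-progress");
        IndexCheckpoint checkpoint = new IndexCheckpoint(dir.resolve("progress.txt"));
        long resumeFrom = checkpoint.load() + 1; // 0 on a fresh start
        for (long i = resumeFrom; i < 5; i++) {
            // index item i and commit the index here, then record progress
            checkpoint.save(i);
        }
        System.out.println("last committed item: " + checkpoint.load());
    }
}
```

Recording the position only after the commit returns means a crash can cause
an item to be indexed twice but never skipped, so the indexing step should
tolerate repeats (for example by replacing documents by key rather than
blindly adding them).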




On Tue, Apr 15, 2014 at 9:57 PM, Jason Wee <peichieh@gmail.com> wrote:

> Hello Jose,
>
> Thank you for your insight.
>
> It sounds to me as if, should an error such as a power failure or human
> error happen before commit is called, the index will be lost?
>
> > (like at each X docs you commit the index and close it)
> iwc.setMaxBufferedDocs(10);
>
> the indexing speed gets very, very slow (about 10-20 docs per second),
> unfortunately, and at times, after indexing N files, it just stalls
> forever. I am not sure what went wrong.
>
> /Jason
>
> On Mon, Apr 14, 2014 at 9:01 PM, Jose Carlos Canova <
> jose.carlos.canova@gmail.com> wrote:
>
> > Hello,
> >
> > That's because NRTCachingDirectory uses an in-memory cache to "mimic in
> > memory the Directory that you used to index your files". In theory the
> > commit is needed because you need to flush the recently added documents;
> > otherwise they will not be available for search until the end of the
> > indexing, when you really do need to flush all documents to the index to
> > properly close the "task that you created to index the documents". You
> > can adopt other strategies for NRT. One alternative is to work with
> > several index segments of a fixed number of documents (like at each X
> > docs you commit the index and close it), using a new instance of a
> > CompositeReader to perform the search. It works in the same manner, since
> > the CompositeReader, as the name says, opens an IndexReader for an
> > IndexSearcher over a list of indexes.
> >
> > It will work in the same manner, with the disadvantage that you have to
> > write your own code.
> >
> >
> >
> >
> > On Mon, Apr 14, 2014 at 9:29 AM, Jason Wee <peichieh@gmail.com> wrote:
> >
> > > > https://lucene.apache.org/core/4_6_0/demo/overview-summary.html
> > > >
> > > > https://lucene.apache.org/core/4_6_0/demo/src-html/org/apache/lucene/demo/IndexFiles.html
> > >
> > > Hello,
> > >
> > > > We are using Lucene 4.6.0 and storing the index on top of Cassandra.
> > > >
> > > > As far as I understand, in order to make the index searchable,
> > > > commit() has to be called in IndexFiles. Is there any other way to
> > > > make the index searchable without calling commit()?
> > > >
> > > > I took a look at NRTCachingDirectory, but our search and index
> > > > applications run in two separate JVMs. As far as NRT is concerned, an
> > > > instance of NRTCachingDirectory needs to be passed to both the
> > > > IndexWriter and the DirectoryReader to make the index searchable.
> > >
> > > Thanks and appreciate any advice.
> > >
> > > /Jason
> > >
> >
>
