lucene-java-user mailing list archives

From Jason Wee <peich...@gmail.com>
Subject Re: make data search as index progress.
Date Fri, 02 May 2014 19:25:22 GMT
Hello Jose,

Well, yes, we are using append. If you do not commit during indexing but
close the writer once indexing is done, the index will persist. If indexing
is interrupted before any commit has been performed, however, everything
indexed up to that point is lost. Yes, we are trying different settings
for the index writer config and merge policy.
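
A minimal sketch of the commit behaviour described above, assuming Lucene
4.6.0 with lucene-core and lucene-analyzers-common on the classpath. The
class name CommitSketch, the field name "contents" and the document counts
are made up for illustration, RAMDirectory replaces the real storage, and
rollback() is only a stand-in for a crash before the next commit:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical demo class; not taken from the thread or the Lucene demo.
public class CommitSketch {

    static IndexWriter openWriter(Directory dir, OpenMode mode) throws Exception {
        IndexWriterConfig iwc = new IndexWriterConfig(
                Version.LUCENE_46, new StandardAnalyzer(Version.LUCENE_46));
        iwc.setOpenMode(mode);
        return new IndexWriter(dir, iwc);
    }

    static void addDocs(IndexWriter writer, int count) throws Exception {
        for (int i = 0; i < count; i++) {
            Document doc = new Document();
            doc.add(new TextField("contents", "document " + i, Field.Store.NO));
            writer.addDocument(doc);
        }
    }

    // Returns {docs seen by a plain reader, docs seen by an NRT reader,
    //          docs left after the simulated crash and reopen}.
    static int[] run() throws Exception {
        Directory dir = new RAMDirectory(); // assumption: in-memory stand-in
        IndexWriter writer = openWriter(dir, OpenMode.CREATE);

        addDocs(writer, 5);
        writer.commit();    // these 5 documents are now durable
        addDocs(writer, 3); // these 3 documents are not committed

        // A reader opened on the Directory sees only the last commit.
        DirectoryReader committed = DirectoryReader.open(dir);
        int committedDocs = committed.numDocs();
        committed.close();

        // An NRT reader also sees uncommitted documents, but it needs the
        // IndexWriter instance, so it only works inside the writer's JVM.
        DirectoryReader nrt = DirectoryReader.open(writer, true);
        int nrtDocs = nrt.numDocs();
        nrt.close();

        // Stand-in for a crash: discard everything since the last commit.
        writer.rollback();

        // Reopen in APPEND mode; only the committed documents remain.
        IndexWriter reopened = openWriter(dir, OpenMode.APPEND);
        int afterReopen = reopened.numDocs();
        reopened.close();

        return new int[] { committedDocs, nrtDocs, afterReopen };
    }

    public static void main(String[] args) throws Exception {
        int[] counts = run();
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```

After a real crash the write lock may still be present and may need to be
cleared before the writer can be reopened, as noted in the quoted reply
below; rollback() releases it, so the sketch does not show that step.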

Thanks for the detailed information. We have also made our code available
on github.com

/Jason




On Wed, Apr 16, 2014 at 10:55 AM, Jose Carlos Canova <
jose.carlos.canova@gmail.com> wrote:

> No, the index remains. You can reopen it using OpenMode.APPEND (an enum
> nested in IndexWriterConfig). If there is an exception such as a power
> loss, you must delete the lock file (which locks the index against other
> indexing processes). I solved some performance issues by using a
> multithreaded task, since IndexWriter is thread-safe. The problem with
> indexing (and probably with the samples, which in fact I never read ;-) )
> is that the indexing task is single-threaded, and committing with few
> segments on a large index causes a performance decrease. One alternative
> (which I haven't tested yet) is to increase the number of segments
> allowed for your index as it grows.
>
> Since I don't trust anybody, I use a database (Postgres) to manage the
> indexing log; this keeps the task on track so it can recover from where
> it stopped. I haven't finished my pseudo project yet, and there are other
> solid alternatives such as Hibernate Search, which is built on top of
> Lucene. My problem is that I don't agree with third-party frameworks on
> top of the Lucene component, because Lucene is being updated and enhanced
> faster than the third-party companies that use it can keep up. That said,
> Hibernate Search and Neo4j are industry standards and both use Lucene.
>
>
>
>
> On Tue, Apr 15, 2014 at 9:57 PM, Jason Wee <peichieh@gmail.com> wrote:
>
> > Hello Jose,
> >
> > Thank you for your insight.
> >
> > It sounds to me that if an error such as a power failure or human error
> > happens before commit() is called, the index will be lost?
> >
> > > (like at each X docs you commit the index and close it)
> > iwc.setMaxBufferedDocs(10);
> >
> > Unfortunately, the indexing speed gets very, very slow (around 10-20
> > docs per second), and at times, after indexing N files, it just stalls
> > forever. I am not sure what went wrong.
> >
> > /Jason
> >
> >
> >
> >
> >
> >
> >
> > On Mon, Apr 14, 2014 at 9:01 PM, Jose Carlos Canova <
> > jose.carlos.canova@gmail.com> wrote:
> >
> > > Hello,
> > >
> > > That's because NRTCachingDirectory uses an in-memory cache to "mimic
> > > in memory the Directory that you used to index your files". In theory
> > > the commit is needed because you need to flush the recently added
> > > documents; otherwise those documents will not be available for search
> > > until the end of indexing, when you have to flush all documents to
> > > the index in order to properly close the "task that you created to
> > > index the documents". You can adopt other strategies for NRT. One
> > > alternative is to work with several index segments with a fixed
> > > number of documents (like at each X docs you commit the index and
> > > close it) and use a new instance of a CompositeReader to perform the
> > > search. It works in the same manner, since the CompositeReader, as
> > > the name says, opens an IndexReader for an IndexSearcher over a list
> > > of indexes.
> > >
> > > It will work in the same manner, but with the disadvantage that you
> > > have to write your own code.
> > >
> > >
> > >
> > >
> > > On Mon, Apr 14, 2014 at 9:29 AM, Jason Wee <peichieh@gmail.com> wrote:
> > >
> > > > https://lucene.apache.org/core/4_6_0/demo/overview-summary.html
> > > >
> > > > https://lucene.apache.org/core/4_6_0/demo/src-html/org/apache/lucene/demo/IndexFiles.html
> > > >
> > > > Hello,
> > > >
> > > > We are using Lucene 4.6.0 and storing the index on top of Cassandra.
> > > >
> > > > As far as I understand, commit() has to be called in IndexFiles to
> > > > make the index searchable. Is there any other way to make the index
> > > > searchable without calling commit()?
> > > >
> > > > I took a look at NRTCachingDirectory, but our search and index
> > > > applications run in two separate JVMs. As far as NRT is concerned,
> > > > an instance of NRTCachingDirectory needs to be passed to both the
> > > > IndexWriter and the DirectoryReader to make the index searchable.
> > > >
> > > > Thanks and appreciate any advice.
> > > >
> > > > /Jason
> > > >
> > >
> >
>
