lucene-java-user mailing list archives

From Cheng <zhoucheng2...@gmail.com>
Subject Re: Configure writer to write to FSDirectory?
Date Mon, 06 Feb 2012 16:54:46 GMT
Will do.

On Tue, Feb 7, 2012 at 12:52 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> You tell NRTCachingDirectory how much RAM it's allowed to use, and it
> then caches newly flushed segments in a private RAMDirectory.
>
> But you should first test performance w/o it (after removing the
> commit calls).  NRT is very fast...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Mon, Feb 6, 2012 at 11:46 AM, Cheng <zhoucheng2008@gmail.com> wrote:
> > Good point. I should remove the commits.
> >
> > Any difference between NRTCachingDirectory and RAMDirectory? How do you
> > define "small"?
> >
> > On Tue, Feb 7, 2012 at 12:42 AM, Michael McCandless <
> > lucene@mikemccandless.com> wrote:
> >
> >> You shouldn't call IW.commit when using NRT; that's the point of NRT
> >> (making changes visible w/o calling commit).
> >>
> >> Only call commit when you require that all changes be durable (survive
> >> OS / JVM crash, power loss, etc.) on disk.
> >>
> >> Also, you can use NRTCachingDirectory which acts like RAMDirectory for
> >> small flushed segments.
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >> On Mon, Feb 6, 2012 at 10:45 AM, Cheng <zhoucheng2008@gmail.com> wrote:
> >> > Uwe, when I meant speed is slow, I didn't refer to instant visibility
> >> > of changes, but that the changes may be synchronized with FSDirectory
> >> > when I use writer.commit().
> >> >
> >> > When I use RAMDirectory, the writer.commit() seems much faster than
> >> > using NRTManager built upon FSDirectory. So, I am guessing the
> >> > difference is the index synchronization.
> >> >
> >> >
> >> >
> >> > On Mon, Feb 6, 2012 at 11:40 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> >> >
> >> >> Please review the following articles about NRT; absolutely instant
> >> >> updates that are visible as they are done are almost impossible
> >> >> (even with RAMDirectory):
> >> >>
> >> >> http://goo.gl/mzAHt
> >> >> http://goo.gl/5RoPx
> >> >> http://goo.gl/vSJ7x
> >> >>
> >> >> Uwe
> >> >>
> >> >> -----
> >> >> Uwe Schindler
> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> >> http://www.thetaphi.de
> >> >> eMail: uwe@thetaphi.de
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: Cheng [mailto:zhoucheng2008@gmail.com]
> >> >> > Sent: Monday, February 06, 2012 4:27 PM
> >> >> > To: java-user@lucene.apache.org
> >> >> > Subject: Re: Configure writer to write to FSDirectory?
> >> >> >
> >> >> > Ian,
> >> >> >
> >> >> > I encountered an issue: I need to frequently update the index.
> >> >> > The NRTManager seems not very helpful on this front, as the speed
> >> >> > is slower than when RAMDirectory is used.
> >> >> >
> >> >> > Any improvement advice?
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Mon, Feb 6, 2012 at 10:24 PM, Cheng <zhoucheng2008@gmail.com> wrote:
> >> >> >
> >> >> > > That really helps! I will try it out.
> >> >> > >
> >> >> > > Thanks.
> >> >> > >
> >> >> > >
> >> >> > > On Mon, Feb 6, 2012 at 10:12 PM, Ian Lea <ian.lea@gmail.com> wrote:
> >> >> > >
> >> >> > >> You would use NRTManagerReopenThread as a standalone thread, not
> >> >> > >> plugged into your Executor stuff.  It is a utility class which you
> >> >> > >> don't have to use.  See the javadocs.
> >> >> > >>
> >> >> > >> But in your case I'd use it, to start with anyway.  Fire it up with
> >> >> > >> suitable settings and forget about it, except to call close()
> >> >> > >> eventually. Once you've got things up and running you can tweak
> >> >> > >> things as much as you want but you appear to be having trouble
> >> >> > >> getting up and running.
> >> >> > >>
> >> >> > >> So ... somewhere in the initialisation code of your app, create an
> >> >> > >> IndexWriter, NRTManager + ReopenThread and SearcherManager as
> >> >> > >> outlined before.  Then pass the NRTManager to any/all write methods
> >> >> > >> or threads and the SearcherManager instance to any/all search
> >> >> > >> methods or threads and you're done.  If you want to use threads that
> >> >> > >> are part of your ExecutorService, fine.  Just wrap it all together
> >> >> > >> in whatever combination of Thread or Runnable instances you want.
> >> >> > >>
> >> >> > >>
> >> >> > >> Does that help?
> >> >> > >>
> >> >> > >>
> >> >> > >> --
> >> >> > >> Ian.
> >> >> > >>
> >> >> > >>
> >> >> > >> > I don't understand this following portion:
> >> >> > >> >
> >> >> > >> > IndexWriter iw = new IndexWriter(whatever - some standard disk index);
> >> >> > >> > NRTManager nrtm = new NRTManager(iw, null);
> >> >> > >> > NRTManagerReopenThread ropt = new NRTManagerReopenThread(nrtm, ...);
> >> >> > >> > ropt.setXxx(...);
> >> >> > >> > ....
> >> >> > >> > ropt.start();
> >> >> > >> >
> >> >> > >> > I have a Java ExecutorService instance running which takes care
> >> >> > >> > of my own applications. I don't know how this
> >> >> > >> > NRTManagerReopenThread works with my own ExecutorService instance.
> >> >> > >> >
> >> >> > >> > Can both work together? How can the NRTManagerReopenThread
> >> >> > >> > instance ropt be plugged into my own multithreading framework?
> >> >> > >> >
> >> >> > >> > On Mon, Feb 6, 2012 at 8:17 PM, Ian Lea <ian.lea@gmail.com> wrote:
> >> >> > >> >
> >> >> > >> >> If you can use NRTManager and SearcherManager things should be
> >> >> > >> >> easy and blazingly fast rather than unbearably slow.  The latter
> >> >> > >> >> phrase is not one often associated with lucene.
> >> >> > >> >>
> >> >> > >> >> IndexWriter iw = new IndexWriter(whatever - some standard disk index);
> >> >> > >> >> NRTManager nrtm = new NRTManager(iw, null);
> >> >> > >> >> NRTManagerReopenThread ropt = new NRTManagerReopenThread(nrtm, ...);
> >> >> > >> >> ropt.setXxx(...);
> >> >> > >> >> ...
> >> >> > >> >> ropt.start();
> >> >> > >> >>
> >> >> > >> >> SearcherManager srchm = nrtm.getSearcherManager(b);
> >> >> > >> >>
> >> >> > >> >> Then add docs to your index via nrtm.addDocument(d), update with
> >> >> > >> >> nrtm.updateDocument(...), and to search use
> >> >> > >> >>
> >> >> > >> >> IndexSearcher searcher = srchm.acquire();
> >> >> > >> >> try {
> >> >> > >> >>   search ...
> >> >> > >> >> } finally {
> >> >> > >> >>   srchm.release(searcher);
> >> >> > >> >> }
> >> >> > >> >>
> >> >> > >> >> All thread safe so you don't have to worry about any
> >> >> > >> >> complications there.  And I bet it'll be blindingly fast.
> >> >> > >> >>
> >> >> > >> >> Don't forget to close() things down at the end.
> >> >> > >> >>
> >> >> > >> >>
> >> >> > >> >> --
> >> >> > >> >> Ian.
> >> >> > >> >>
> >> >> > >> >>
> >> >> > >> >>
> >> >> > >> >> On Mon, Feb 6, 2012 at 12:15 AM, Cheng <zhoucheng2008@gmail.com> wrote:
> >> >> > >> >> > I was trying to, but don't know how to, even though I read some
> >> >> > >> >> > of your blogs.
> >> >> > >> >> >
> >> >> > >> >> > On Sun, Feb 5, 2012 at 10:22 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
> >> >> > >> >> >
> >> >> > >> >> >> Are you using near-real-time readers?
> >> >> > >> >> >>
> >> >> > >> >> >> (IndexReader.open(IndexWriter))
> >> >> > >> >> >>
> >> >> > >> >> >> Mike McCandless
> >> >> > >> >> >>
> >> >> > >> >> >> http://blog.mikemccandless.com
> >> >> > >> >> >>
> >> >> > >> >> >> On Sun, Feb 5, 2012 at 9:03 AM, Cheng <zhoucheng2008@gmail.com> wrote:
> >> >> > >> >> >> > Hi Uwe,
> >> >> > >> >> >> >
> >> >> > >> >> >> > My challenge is that I need to update/modify the indexes
> >> >> > >> >> >> > frequently while providing the search capability. I was
> >> >> > >> >> >> > trying to use FSDirectory, but found out that the reading
> >> >> > >> >> >> > and writing from/to FSDirectory is unbearably slow. So I
> >> >> > >> >> >> > now am trying the RAMDirectory, which is fast.
> >> >> > >> >> >> >
> >> >> > >> >> >> > I don't know of MMapDirectory, and wonder if it is as fast
> >> >> > >> >> >> > as RAMDirectory.
> >> >> > >> >> >> >
> >> >> > >> >> >> >
> >> >> > >> >> >> > On Sun, Feb 5, 2012 at 4:14 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> >> >> > >> >> >> >
> >> >> > >> >> >> >> Hi Cheng,
> >> >> > >> >> >> >>
> >> >> > >> >> >> >> It seems that you use a RAMDirectory for *caching*, otherwise
> >> >> > >> >> >> >> it makes no sense to write changes back. In recent Lucene
> >> >> > >> >> >> >> versions, this is not a good idea, especially for large
> >> >> > >> >> >> >> indexes (RAMDirectory eats your heap space, allocates
> >> >> > >> >> >> >> millions of small byte[] arrays,...). If you need something
> >> >> > >> >> >> >> like a caching Directory and you are working on a 64bit
> >> >> > >> >> >> >> platform, you can use MMapDirectory (where the operating
> >> >> > >> >> >> >> system kernel manages the read/write between disk and
> >> >> > >> >> >> >> memory). MMapDirectory is returned by default for
> >> >> > >> >> >> >> FSDirectory.open() on most 64 bit platforms. The good thing:
> >> >> > >> >> >> >> the "caching" space is outside your JVM heap, so does not
> >> >> > >> >> >> >> slow down the garbage collector. So be sure to *not* allocate
> >> >> > >> >> >> >> too much heap space (-Xmx) to your search app, only the
> >> >> > >> >> >> >> minimum needed to execute it and leave the rest of your RAM
> >> >> > >> >> >> >> available for the OS kernel to manage FS cache.
> >> >> > >> >> >> >>
> >> >> > >> >> >> >> Uwe
> >> >> > >> >> >> >>
> >> >> > >> >> >> >> -----
> >> >> > >> >> >> >> Uwe Schindler
> >> >> > >> >> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> >> > >> >> >> >> http://www.thetaphi.de
> >> >> > >> >> >> >> eMail: uwe@thetaphi.de
> >> >> > >> >> >> >>
> >> >> > >> >> >> >>
> >> >> > >> >> >> >> > -----Original Message-----
> >> >> > >> >> >> >> > From: Cheng [mailto:zhoucheng2008@gmail.com]
> >> >> > >> >> >> >> > Sent: Sunday, February 05, 2012 7:56 AM
> >> >> > >> >> >> >> > To: java-user@lucene.apache.org
> >> >> > >> >> >> >> > Subject: Configure writer to write to FSDirectory?
> >> >> > >> >> >> >> >
> >> >> > >> >> >> >> > Hi,
> >> >> > >> >> >> >> >
> >> >> > >> >> >> >> > I build a RAMDirectory on an FSDirectory, and would like
> >> >> > >> >> >> >> > the writer associated with the RAMDirectory to
> >> >> > >> >> >> >> > periodically write to hard drive.
> >> >> > >> >> >> >> >
> >> >> > >> >> >> >> > Is this achievable?
> >> >> > >> >> >> >> >
> >> >> > >> >> >> >> > Thanks.
> >> >> > >> >> >> >>
> >> >> > >> >> >> >>
> >> >> > >> >> >> >>
> >> >> > >> >> >> >> ---------------------------------------------------------------------
> >> >> > >> >> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> >> > >> >> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >> > >> >> >> >>
> >> >> > >> >> >> >>
> >> >> > >> >> >>
> >> >> > >> >> >>
> >> >> > >> >> >>
> >> >> > >> >> >>
> >> >> > >> >>
> >> >> > >> >>
> >> >> > >> >>
> >> >> > >> >>
> >> >> > >>
> >> >> > >>
> >> >> > >>
> >> >> > >>
> >> >> > >
> >> >>
> >> >>
> >> >>
> >> >>
> >>
> >>
> >>
>
>
>
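
The distinction drawn in this thread (a reopen makes changes searchable, a
commit makes them durable) can be illustrated with a toy model. The sketch
below is plain Java with no Lucene dependency. The class ToyNrtIndex and all
of its methods are hypothetical names invented for illustration; they are not
part of any Lucene API and do not show how Lucene is implemented. The only
assumption carried over from the thread is that a refresh is a cheap
in-memory step, while a commit is the step whose results survive a crash.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model only, NOT Lucene code. Documents pass through three states:
 * added, searchable (after refresh), and durable (after commit).
 */
class ToyNrtIndex {
    private final List<String> docs = new ArrayList<>();
    private int visible = 0; // docs[0, visible) are searchable
    private int durable = 0; // docs[0, durable) would survive a crash

    /** Adds a document; it is neither searchable nor durable yet. */
    synchronized void addDocument(String doc) {
        docs.add(doc);
    }

    /** Stands in for an NRT reopen: everything added so far becomes searchable. */
    synchronized void refresh() {
        visible = docs.size();
    }

    /** Stands in for commit: everything added so far becomes durable. */
    synchronized void commit() {
        durable = docs.size();
    }

    /** Searches only the portion of the index exposed by the last refresh. */
    synchronized boolean isSearchable(String doc) {
        return docs.subList(0, visible).contains(doc);
    }

    /** Simulates a crash and restart: uncommitted documents are lost. */
    synchronized void crash() {
        docs.subList(durable, docs.size()).clear();
        visible = durable; // a reopened index exposes the last commit
    }

    public static void main(String[] args) {
        ToyNrtIndex index = new ToyNrtIndex();

        index.addDocument("doc1");
        System.out.println(index.isSearchable("doc1")); // false: no refresh yet

        index.refresh();
        System.out.println(index.isSearchable("doc1")); // true: visible without commit

        index.crash();
        System.out.println(index.isSearchable("doc1")); // false: was never committed

        index.addDocument("doc2");
        index.commit();
        index.crash();
        System.out.println(index.isSearchable("doc2")); // true: committed before crash
    }
}
```

In this model, calling commit() after every update would pay the durability
cost each time, while refresh() alone is enough for searches to see new
documents. That mirrors the advice quoted above: remove the per-update
commits and commit only when changes must survive a crash.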
