lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Configure writer to write to FSDirectory?
Date Mon, 06 Feb 2012 16:52:23 GMT
You tell NRTCachingDirectory how much RAM it's allowed to use, and it
then caches newly flushed segments in a private RAMDirectory.

But you should first test performance w/o it (after removing the
commit calls).  NRT is very fast...

Mike McCandless

http://blog.mikemccandless.com
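
To make the RAM budget idea above concrete, here is a toy model in plain
Java. It is only a sketch of the caching policy described in this message,
not Lucene's implementation: the class name SegmentCacheSketch, its methods,
and the two byte limits are hypothetical names made up for illustration, and
two HashMaps stand in for the private RAMDirectory and the on-disk directory.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of "cache small, newly flushed segments in RAM" (hypothetical; not Lucene code). */
final class SegmentCacheSketch {
    private final long maxSegmentBytes; // largest single segment we are willing to cache
    private final long maxCachedBytes;  // total RAM budget for all cached segments
    private final Map<String, byte[]> ram = new HashMap<>();  // stand-in for the private RAMDirectory
    private final Map<String, byte[]> disk = new HashMap<>(); // stand-in for the on-disk directory
    private long cachedBytes = 0;

    SegmentCacheSketch(long maxSegmentBytes, long maxCachedBytes) {
        this.maxSegmentBytes = maxSegmentBytes;
        this.maxCachedBytes = maxCachedBytes;
    }

    /**
     * Stores a newly flushed segment (names are assumed to be unique).
     * Returns true if it was cached in RAM, false if it went straight to disk.
     */
    boolean flush(String name, byte[] data) {
        boolean fits = data.length <= maxSegmentBytes
                && cachedBytes + data.length <= maxCachedBytes;
        if (fits) {
            ram.put(name, data);
            cachedBytes += data.length;
        } else {
            disk.put(name, data);
        }
        return fits;
    }

    /** Models a commit: everything still held in RAM is moved to disk so that it is durable. */
    void commit() {
        disk.putAll(ram);
        ram.clear();
        cachedBytes = 0;
    }

    /** Readers see a segment wherever it currently lives; returns null if it does not exist. */
    byte[] read(String name) {
        byte[] cached = ram.get(name);
        return cached != null ? cached : disk.get(name);
    }

    boolean isOnDisk(String name) {
        return disk.containsKey(name);
    }

    long cachedBytes() {
        return cachedBytes;
    }
}
```

In this model a segment is readable as soon as it is flushed, whether it sits
in RAM or on disk; commit() only matters for durability, which is why calling
it on every update gives up the benefit of the cache.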

On Mon, Feb 6, 2012 at 11:46 AM, Cheng <zhoucheng2008@gmail.com> wrote:
> Good point. I should remove the commits.
>
> Any difference between NRTCachingDirectory and RAMDirectory? How do you define
> "small"?
>
> On Tue, Feb 7, 2012 at 12:42 AM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> You shouldn't call IW.commit when using NRT; that's the point of NRT
>> (making changes visible w/o calling commit).
>>
>> Only call commit when you require that all changes be durable (survive
>> OS / JVM crash, power loss, etc.) on disk.
>>
>> Also, you can use NRTCachingDirectory which acts like RAMDirectory for
>> small flushed segments.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Mon, Feb 6, 2012 at 10:45 AM, Cheng <zhoucheng2008@gmail.com> wrote:
>> > Uwe, when I said the speed is slow, I wasn't referring to instant visibility of
>> > changes, but to the changes being synchronized with FSDirectory when I
>> > use writer.commit().
>> >
>> > When I use RAMDirectory, writer.commit() seems much faster than when using
>> > NRTManager built upon FSDirectory. So, I am guessing the difference is the
>> > index synchronization.
>> >
>> >
>> >
>> > On Mon, Feb 6, 2012 at 11:40 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
>> >
>> >> Please review the following articles about NRT. Absolutely instant updates
>> >> that are visible as they are made are almost impossible (even with
>> >> RAMDirectory):
>> >>
>> >> http://goo.gl/mzAHt
>> >> http://goo.gl/5RoPx
>> >> http://goo.gl/vSJ7x
>> >>
>> >> Uwe
>> >>
>> >> -----
>> >> Uwe Schindler
>> >> H.-H.-Meier-Allee 63, D-28213 Bremen
>> >> http://www.thetaphi.de
>> >> eMail: uwe@thetaphi.de
>> >>
>> >> > -----Original Message-----
>> >> > From: Cheng [mailto:zhoucheng2008@gmail.com]
>> >> > Sent: Monday, February 06, 2012 4:27 PM
>> >> > To: java-user@lucene.apache.org
>> >> > Subject: Re: Configure writer to write to FSDirectory?
>> >> >
>> >> > Ian,
>> >> >
>> >> > I have run into an issue: I need to update the index frequently. The
>> >> > NRTManager does not seem very helpful on this front, as it is slower
>> >> > than when RAMDirectory is used.
>> >> >
>> >> > Any improvement advice?
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Feb 6, 2012 at 10:24 PM, Cheng <zhoucheng2008@gmail.com> wrote:
>> >> >
>> >> > > That really helps! I will try it out.
>> >> > >
>> >> > > Thanks.
>> >> > >
>> >> > >
>> >> > > On Mon, Feb 6, 2012 at 10:12 PM, Ian Lea <ian.lea@gmail.com> wrote:
>> >> > >
>> >> > >> You would use NRTManagerReopenThread as a standalone thread, not
>> >> > >> plugged into your Executor stuff.  It is a utility class which you
>> >> > >> don't have to use.  See the javadocs.
>> >> > >>
>> >> > >> But in your case I'd use it, to start with anyway.  Fire it up with
>> >> > >> suitable settings and forget about it, except to call close()
>> >> > >> eventually. Once you've got things up and running you can tweak
>> >> > >> things as much as you want but you appear to be having trouble
>> >> > >> getting up and running.
>> >> > >>
>> >> > >> So ... somewhere in the initialisation code of your app, create an
>> >> > >> IndexWriter, NRTManager + ReopenThread and SearcherManager as
>> >> > >> outlined before.  Then pass the NRTManager to any/all write methods
>> >> > >> or threads and the SearcherManager instance to any/all search methods
>> >> > >> or threads and you're done.  If you want to use threads that are part
>> >> > >> of your ExecutorService, fine.  Just wrap it all together in whatever
>> >> > >> combination of Thread or Runnable instances you want.
>> >> > >>
>> >> > >>
>> >> > >> Does that help?
>> >> > >>
>> >> > >>
>> >> > >> --
>> >> > >> Ian.
>> >> > >>
>> >> > >>
>> >> > >> > I don't understand the following portion:
>> >> > >> >
>> >> > >> > IndexWriter iw = new IndexWriter(whatever - some standard disk index);
>> >> > >> > NRTManager nrtm = new NRTManager(iw, null);
>> >> > >> > NRTManagerReopenThread ropt = new NRTManagerReopenThread(nrtm, ...);
>> >> > >> > ropt.setXxx(...);
>> >> > >> > ....
>> >> > >> > ropt.start();
>> >> > >> >
>> >> > >> > I have a Java ExecutorService instance running, which takes care of my
>> >> > >> > own applications. I don't know how this NRTManagerReopenThread works
>> >> > >> > with my own ExecutorService instance.
>> >> > >> >
>> >> > >> > Can both work together? How can the NRTManagerReopenThread instance ropt be
>> >> > >> > plugged into my own multithreading framework?
>> >> > >> >
>> >> > >> > On Mon, Feb 6, 2012 at 8:17 PM, Ian Lea <ian.lea@gmail.com> wrote:
>> >> > >> >
>> >> > >> >> If you can use NRTManager and SearcherManager things should be
>> >> > >> >> easy and blazingly fast rather than unbearably slow.  The latter
>> >> > >> >> phrase is not one often associated with lucene.
>> >> > >> >>
>> >> > >> >> IndexWriter iw = new IndexWriter(whatever - some standard disk index);
>> >> > >> >> NRTManager nrtm = new NRTManager(iw, null);
>> >> > >> >> NRTManagerReopenThread ropt = new NRTManagerReopenThread(nrtm, ...);
>> >> > >> >> ropt.setXxx(...);
>> >> > >> >> ...
>> >> > >> >> ropt.start();
>> >> > >> >>
>> >> > >> >> SearcherManager srchm = nrtm.getSearcherManager(b);
>> >> > >> >>
>> >> > >> >> Then add docs to your index via nrtm.addDocument(d), update with
>> >> > >> >> nrtm.updateDocument(...), and to search use
>> >> > >> >>
>> >> > >> >> IndexSearcher searcher = srchm.acquire();
>> >> > >> >> try {
>> >> > >> >>   search ...
>> >> > >> >> } finally {
>> >> > >> >>   srchm.release(searcher);
>> >> > >> >> }
>> >> > >> >>
>> >> > >> >> All thread safe so you don't have to worry about any complications
>> >> > >> >> there.  And I bet it'll be blindingly fast.
>> >> > >> >>
>> >> > >> >> Don't forget to close() things down at the end.
>> >> > >> >>
>> >> > >> >>
>> >> > >> >> --
>> >> > >> >> Ian.
>> >> > >> >>
>> >> > >> >>
>> >> > >> >>
>> >> > >> >> On Mon, Feb 6, 2012 at 12:15 AM, Cheng <zhoucheng2008@gmail.com> wrote:
>> >> > >> >> > I was trying to, but I don't know how, even though I read some of your
>> >> > >> >> > blog posts.
>> >> > >> >> >
>> >> > >> >> > On Sun, Feb 5, 2012 at 10:22 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
>> >> > >> >> >
>> >> > >> >> >> Are you using near-real-time readers?
>> >> > >> >> >>
>> >> > >> >> >> (IndexReader.open(IndexWriter))
>> >> > >> >> >>
>> >> > >> >> >> Mike McCandless
>> >> > >> >> >>
>> >> > >> >> >> http://blog.mikemccandless.com
>> >> > >> >> >>
>> >> > >> >> >> On Sun, Feb 5, 2012 at 9:03 AM, Cheng <zhoucheng2008@gmail.com> wrote:
>> >> > >> >> >> > Hi Uwe,
>> >> > >> >> >> >
>> >> > >> >> >> > My challenge is that I need to update/modify the indexes frequently
>> >> > >> >> >> > while providing search capability. I was trying to use FSDirectory,
>> >> > >> >> >> > but found that reading from and writing to FSDirectory is unbearably
>> >> > >> >> >> > slow. So I am now trying RAMDirectory, which is fast.
>> >> > >> >> >> >
>> >> > >> >> >> > I don't know MMapDirectory, and wonder whether it is as fast as
>> >> > >> >> >> > RAMDirectory.
>> >> > >> >> >> >
>> >> > >> >> >> >
>> >> > >> >> >> > On Sun, Feb 5, 2012 at 4:14 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
>> >> > >> >> >> >
>> >> > >> >> >> >> Hi Cheng,
>> >> > >> >> >> >>
>> >> > >> >> >> >> It seems that you use a RAMDirectory for *caching*; otherwise it makes
>> >> > >> >> >> >> no sense to write changes back. In recent Lucene versions, this is not
>> >> > >> >> >> >> a good idea, especially for large indexes (RAMDirectory eats your heap
>> >> > >> >> >> >> space, allocates millions of small byte[] arrays, ...). If you need
>> >> > >> >> >> >> something like a caching Directory and you are working on a 64-bit
>> >> > >> >> >> >> platform, you can use MMapDirectory (where the operating system kernel
>> >> > >> >> >> >> manages the reads/writes between disk and memory). MMapDirectory is
>> >> > >> >> >> >> returned by default for FSDirectory.open() on most 64-bit platforms.
>> >> > >> >> >> >> The good thing: the "caching" space is outside your JVM heap, so it
>> >> > >> >> >> >> does not slow down the garbage collector. So be sure *not* to allocate
>> >> > >> >> >> >> too much heap space (-Xmx) to your search app, only the minimum needed
>> >> > >> >> >> >> to execute it, and leave the rest of your RAM available for the OS
>> >> > >> >> >> >> kernel to manage the FS cache.
>> >> > >> >> >> >>
>> >> > >> >> >> >> Uwe
>> >> > >> >> >> >>
>> >> > >> >> >> >> -----
>> >> > >> >> >> >> Uwe Schindler
>> >> > >> >> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen
>> >> > >> >> >> >> http://www.thetaphi.de
>> >> > >> >> >> >> eMail: uwe@thetaphi.de
>> >> > >> >> >> >>
>> >> > >> >> >> >>
>> >> > >> >> >> >> > -----Original Message-----
>> >> > >> >> >> >> > From: Cheng [mailto:zhoucheng2008@gmail.com]
>> >> > >> >> >> >> > Sent: Sunday, February 05, 2012 7:56 AM
>> >> > >> >> >> >> > To: java-user@lucene.apache.org
>> >> > >> >> >> >> > Subject: Configure writer to write to FSDirectory?
>> >> > >> >> >> >> >
>> >> > >> >> >> >> > Hi,
>> >> > >> >> >> >> >
>> >> > >> >> >> >> > I built a RAMDirectory on top of an FSDirectory, and would like the
>> >> > >> >> >> >> > writer associated with the RAMDirectory to periodically write to the
>> >> > >> >> >> >> > hard drive.
>> >> > >> >> >> >> >
>> >> > >> >> >> >> > Is this achievable?
>> >> > >> >> >> >> >
>> >> > >> >> >> >> > Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

