lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Stoppelman <stop...@gmail.com>
Subject Re: Reloading RAM Directory from updated FS Directory
Date Wed, 10 Jun 2009 17:11:41 GMT
Another potential idea would be to break up the index into N indices
such that each index is small enough to fit two in memory and then you
can swap them.
   http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/MultiReader.html
This is just an idea, I haven't tried it.

My current solution for the warm up problem is that I use "cat
/path/to/index/* > /dev/null" to force the new index into the disk
cache. Kind of lame but this works for getting a new index into disk
cache if you don't have enough RAM to fit two indices in RAM for the
swap. I'll also mention that I remove the box from the load balancer
while this disk cache warming is occurring.

M



On Wed, Jun 10, 2009 at 9:17 AM, Diamond, Greg<GDiamond@harryfox.com> wrote:
> Thanks for the responses.  I am testing it out using MMapDirectory.
>
> Cheers!
>
> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Wednesday, June 10, 2009 6:36 AM
> To: java-user@lucene.apache.org
> Subject: RE: Reloading RAM Directory from updated FS Directory
>
>
> There is currently a patch/idea from Earwin around that modifies
> MMapDirectory to optionally call MappedByteBuffer.load() after mapping a
> file from the directory. MappedByteBuffer.load() tells the operating system
> kernel to try to swap as much as possible from the file into physical RAM.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>> -----Original Message-----
>> From: eks dev [mailto:eksdev@yahoo.co.uk]
>> Sent: Wednesday, June 10, 2009 12:30 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: Reloading RAM Directory from updated FS Directory
>>
>>
>> there is one case where MMAP does not beat RAM, initial warm-up after
>> process restart. With MMAP it can take a while before you get up to speed.
>> MMAP with reopen is the best, if you run without restart.
>>
>>
>>
>>
>>
>> ----- Original Message ----
>> > From: Uwe Schindler <uwe@thetaphi.de>
>> > To: java-user@lucene.apache.org
>> > Sent: Wednesday, 10 June, 2009 0:24:35
>> > Subject: RE: Reloading RAM Directory from updated FS Directory
>> >
>> > Hi Greg & Kay,
>> >
>> > > Lucene (IndexSearchers / IndexReaders) has the notion of cache as well
>> > > so you need to check if you really want a 100% replication of
>> > > RAMDirectory / FSDirectory as well concurrently in memory.  Have you
>> > > tested with the FieldCache policies before moving onto the
>> RAMDirectory
>> > > based backup solution.
>> >
>> > As you have so big indexes, I think you use a 64 bit machine and 64 bit
>> JVM.
>> > I would recommend you in this case to use MMapDirectory instead of
>> copying
>> > the contents to a RAMDirectory. MMapDirectory works like FSDirectory,
>> but it
>> > does not read the files via normal RandomAccessFile actions, but it maps
>> the
>> > file into the 64 bit address space. In principle this is like a swap
>> file,
>> > the contents of the file are seen like memory. The operating system
>> kernel
>> > then uses the same strategies to map parts of the index into real RAM,
>> just
>> > like a swap file. For Lucene it looks like a RAMDirectory backed by a
>> file.
>> >
>> > Changes to the underlying index can be seen immediately, and changes can
>> > reloaded using IndexReader.reopen(), which is much faster than reopening
>> the
>> > index in complete.
>> >
>> > Kay Kay: Instantiating an IndexSearcher on top of a IndexReader costs
>> > nothing (its just a wrapper around the IndexReader). The costly part is
>> > opening the IndexReader. So using reopen() and wrapping this always with
>> a
>> > fresh IndexReader is faster than creating an IndexSearcher using the
>> > directory path.
>> >
>> > Uwe
>> >
>> > > Diamond, Greg wrote:
>> > > > Hi All -
>> > > >
>> > > > What is the best way to load a RAM Directory from a FS Directory,
>> and
>> > > periodically reload the RAM Directory to pick up new documents?
>> > > >
>> > > > The scenario I have is I create several large directories which I
>> create
>> > > to a file system, then load them into ram for faster searching.
>> > > > They takes several hours to create, so i want to retain a file copy
>> so
>> > > in the event of a service/server crash or reboot, they can load again
>> > > > in a few seconds.  Once a day I append or recreate the FS
>> Directories
>> > > with new entries, and then reload the RAM Directories to pick up the
>> > > > new entries.
>> > > >
>> > > > Problem is, what I am doing seems to cause a memory leak.  The
>> > > directories take ~12 GB of RAM to load, but bloats
>> > > > to 24 GB (the -Xmx setting) after a few reloads and stays there.
>> > > >
>> > > >     // Map to hold single instance of the IndexSearcher, 1 per RAM
>> > > Directory, to be reused across requests.
>> > > >     private static final Mapsearchers = new
>> > > HashMap();
>> > > >
>> > > >     // Creates the IndexSearcher and stores it in the static map.
>> > > >     public static void load(IndexName index, ...) {
>> > > >         // sync... etc
>> > > >         RAMDirectory dir = new
>> > > RAMDirectory(FSDirectory.getDirectory(directoryPath);
>> > > >         IndexReader reader = IndexReader.open(dir, true);
>> > > >         IndexSearcher searcher = new IndexSearcher(reader);
>> > > >         searchers.put(index, searcher);
>> > > >     }
>> > > >
>> > > >     public static IndexSearcher get(IndexName index, ...) {
>> > > >         // sync... etc
>> > > >         return searchers.get(index);
>> > > >     }
>> > > >
>> > > >     // After an indexing service appends the existing FS Directory,
>> this
>> > > reloads it.
>> > > >     public static void reload(IndexName index, ...) {
>> > > >         // sync... etc
>> > > >         IndexSearcher searcher = searchers.get(index);
>> > > >         FSDirectory fsDirectory =
>> > > FSDirectory.getDirectory(directoryPath);
>> > > >         Directory ramDirectory =
>> > > searcher.getIndexReader().directory();
>> > > >         Directory.copy(fsDirectory, ramDirectory, false);
>> > > >     }
>> > > >
>> > > > I've tries some theme and variations on this with the same issue.
>> > > >
>> > > > TIA!
>> > > >
>> > > > gd
>> > > >
>> > > >
>> > > > --------------------------------------------------------------------
>> -
>> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
>> > > >
>> > > >
>> > > >
>> > >
>> > >
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message