lucene-java-user mailing list archives

From Sanne Grinovero <sanne.grinov...@gmail.com>
Subject Re: RAMDirectory doesn't win over FSDirectory all the time, why?
Date Fri, 17 Jun 2011 09:42:56 GMT
Hello,
I came to similar conclusions, and have a similar comparison test
available here:
https://github.com/infinispan/infinispan/blob/master/lucene-directory/src/test/java/org/infinispan/lucene/profiling/PerformanceCompareStressTest.java

In my test I explicitly run the RAMDirectory first to warm up the JVM
and the other Lucene components. Also, while I default to a short
testing time, to perform a fair comparison you should:
a) make the test quite long - a couple of hours
b) start with a fairly large index; this version starts with an empty
index that slowly grows.
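
To illustrate the warm-up point, here is a minimal sketch. It is not the
code from the linked test. It runs a workload a few times untimed before
measuring it, so the first candidate does not pay the JIT compilation
cost alone. The names WarmupBenchmark, averageNanos and busyWork are
hypothetical, and busyWork is only a placeholder: in a real comparison
the two Runnables would index documents into a RAMDirectory and an
FSDirectory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical harness: warm up first, then time the same workload.
class WarmupBenchmark {

    // Written to on every run so the JIT cannot discard the placeholder work.
    static volatile long sink;

    /**
     * Runs the task warmupRuns times without timing it, then returns the
     * average wall-clock time in nanoseconds over measuredRuns timed runs.
     */
    static long averageNanos(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // untimed: lets the JVM load classes and compile hot code
        }
        long total = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            total += System.nanoTime() - start;
        }
        return total / measuredRuns;
    }

    // Placeholder workload standing in for "index N documents into a directory".
    static long busyWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumption: each entry would wrap one directory implementation.
        Map<String, Runnable> workloads = new LinkedHashMap<>();
        workloads.put("candidate A", () -> sink = busyWork(200_000));
        workloads.put("candidate B", () -> sink = busyWork(200_000));

        for (Map.Entry<String, Runnable> e : workloads.entrySet()) {
            long avg = averageNanos(e.getValue(), 5, 20);
            System.out.println(e.getKey() + ": " + (avg / 1_000) + " us per run");
        }
    }
}
```

Warming up each candidate separately also makes the result less
dependent on which one happens to run first. It does not replace the
long run and the large starting index suggested above.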

I'm running the RAMDirectory first because, to be fair, in my case I
wasn't very interested in its performance: being limited to the
available memory on your JVM is imho quite a dealbreaker for real
applications. Also, since the operating system can apply several smart
caches when there's enough memory, my conclusion is that when you have
memory, you should limit the JVM heap and leave the rest to the OS, so
that FSDirectory can make better use of it, as this implementation is
really well optimized, at least for local disks.

When you don't have enough available memory, I would suggest - but
warning: I'm biased - trying the Infinispan-based Lucene Directory,
which is able to pool the memory of multiple (remote) JVMs and
passivates to external storage such as disk only when strictly
needed (or for backups/shutdown). Being still mostly an in-memory
solution, it's able to outperform the FSDirectory during write
operations and is comparable in search performance; in some cases it
is a little bit slower, but it compensates by being able to scale
horizontally with real-time distribution. A current limitation is that
you still need to use a single IndexWriter, even cluster-wide: the
code is very simple and directly mimics the FSDirectory logic, so,
unlike other distributed solutions, it supports all the same features
and inherits the same limitations.

Regards,
Sanne

2011/6/17 Lance Norskog <goksron@gmail.com>:
> The RAMDirectory uses Java memory, an FSDirectory does not. Holding
> Java memory makes garbage collection work harder. The operating system
> is very very good at managing disk buffers, and does a better job
> using spare memory than Java does.
>
> For real-world sites, RAMDirectory is almost always useless. Maybe the
> Instantiated index stuff is more what you want?
>
> Lance
>
> On Tue, Jun 7, 2011 at 2:52 AM, zhoucheng2008 <zhoucheng2008@gmail.com> wrote:
>> Makes sense. Thanks
>>
>> -----Original Message-----
>> From: Toke Eskildsen [mailto:te@statsbiblioteket.dk]
>> Sent: Tuesday, June 07, 2011 4:28 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: RAMDirectory doesn't win over FSDirectory all the time, why?
>>
>> On Mon, 2011-06-06 at 15:29 +0200, zhoucheng2008 wrote:
>>> I read the lucene in action book and just tested the
>>> FSversusRAMDirectoryTest.java with the following uncommented:
>>> [...]Here is the output:
>>>
>>> RAMDirectory Time: 805 ms
>>>
>>> FSDirectory Time : 728 ms
>>
>> This is the code, right?
>> http://java.codefetch.com/example/in/LuceneInAction/src/lia/indexing/FSversusRAMDirectoryTest.java
>>
>> The test is problematic as the two tests run sequentially in the same JVM.
>>
>> If you change
>>  long ramTiming = timeIndexWriter(ramDir);
>>  long fsTiming = timeIndexWriter(fsDir);
>> to
>>  long fsTiming = timeIndexWriter(fsDir);
>>  long ramTiming = timeIndexWriter(ramDir);
>> my guess is that RAMDirectory will be faster. For a better
>> comparison, perform each test in separate runs (make a test
>> class just for RAMDirectory and one just for FSDirectory,
>> then run them one at a time, each in its own JVM).
>>
>> One big problem when comparing RAMDirectory to file-access
>> is caching. What you measure with a test might not be what
>> you see in production, as the production index might be
>> large compared to RAM available for file caching.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
>
>
> --
> Lance Norskog
> goksron@gmail.com