From: Cheng
To: java-user@lucene.apache.org
Date: Mon, 18 Jun 2012 19:42:35 +0800
Subject: Re: RAMDirectory unexpectedly slows

Lucene is used in the following steps:

1) store the data of interest in Lucene indexes;
2) search for keywords against the indexes;
3) write new data into the indexes and refresh the reader;
4) use the reader to search for keywords,

and steps 2-4 repeat.

As you can see, there are lots of read and update actions. I guess MMapDir is slower because it needs to synchronize to a local drive.

The code is attached:

public class YYTLucene {

    private static Logger logger = Logger.getLogger(YYTLuceneImpl.class);

    private static FSDirectory indexDir;
    private static RAMDirectory ramDir;
    // private static MMapDirectory ramDir;

    private static IndexWriter iw;
    private static IndexSearcher is;
    private static IndexReader ir;

    private static YYTLucene instance;

    public static YYTLucene getInstance(String type) {
        if (instance == null) {
            instance = new YYTLucene(type);
        }
        return instance;
    }

    private YYTLucene(String type) {
        try {
            indexDir = new NIOFSDirectory(new File(ERConstants.indexFolder1 + "/" + type));
            ramDir = new RAMDirectory(indexDir);
            // ramDir = new MMapDirectory(new File(ERConstants.indexFolder1 + "/" + type));

            IndexWriterConfig iwConfig = new IndexWriterConfig(
                    ERConstants.version,
                    new LimitTokenCountAnalyzer(ERConstants.analyzer, ERConstants.maxTokenNum));
            // iwConfig.setMaxBufferedDocs(ERConstants.maxBufferedDocs);
            // iwConfig.setRAMBufferSizeMB(ERConstants.RAMBufferSizeMB);

            iw = new IndexWriter(ramDir, iwConfig);
            iw.commit();

            ir = IndexReader.open(iw, true);
            is = new IndexSearcher(ir);
        } catch (IOException e) {
            e.printStackTrace();
            logger.info("Can't initiate YYTLuceneImpl...");
        }
    }

    public IndexWriter getIndexWriter() {
        return iw;
    }

    public void setIndexWriter(IndexWriter iw) {
        YYTLucene.iw = iw;
    }

    public IndexSearcher getIndexSearcher() {
        return is;
    }

    public void setIndexSearcher(IndexSearcher is) {
        YYTLucene.is = is;
    }

    public IndexReader getIndexReader() {
        return ir;
    }

    public static void setIndexReader(IndexReader ir) {
        YYTLucene.ir = ir;
    }
}

On Mon, Jun 18, 2012 at 7:32 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> 9 fold improvement using RAMDir over MMapDir is much more than I've
> seen (~30-40% maybe) in the past.
>
> Can you explain how you are using Lucene?
>
> You may also want to try the CachingRAMDirectory patch on
> https://issues.apache.org/jira/browse/LUCENE-4123
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Sat, Jun 16, 2012 at 7:18 AM, Cheng wrote:
>
> > After a number of tests, the performance of MMapDirectory is not even
> > close to that of RAMDirectory, in terms of speed.
> >
> > My application w/ the former can only deal with 10 tasks per round
> > while it could handle over 90 w/ RAMDirectory.
> >
> > I use the application on Linux.
> >
> > What can be the reasons?
> >
> > Thanks.
> >
> > On Tue, Jun 5, 2012 at 7:53 AM, Uwe Schindler wrote:
> >
> >> This is managed by your operating system. In general OS kernels like
> >> Linux or Windows use all free memory to cache disk accesses.
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: uwe@thetaphi.de
> >>
> >> > -----Original Message-----
> >> > From: Cheng [mailto:zhoucheng2008@gmail.com]
> >> > Sent: Monday, June 04, 2012 6:10 PM
> >> > To: java-user@lucene.apache.org
> >> > Subject: Re: RAMDirectory unexpectedly slows
> >> >
> >> > Can I control the size of RAM given to either MMapDirectory or
> >> > ByteBufferDirectory?
> >> >
> >> > On Mon, Jun 4, 2012 at 11:42 PM, Uwe Schindler wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > If you are using MMapDirectory or this ByteBufferDirectory (which is
> >> > > similar to the first) the used RAM is outside the JVM heap; it is in
> >> > > the FS cache of the OS kernel. Giving too much memory to the JVM
> >> > > penalizes the OS cache, so give only as much as the app needs. Lucene
> >> > > and the OS kernel will then utilize the remaining memory for caching.
> >> > >
> >> > > Please read the docs of MMapDirectory and inform yourself about mmap
> >> > > in e.g. Wikipedia.
> >> > >
> >> > > Uwe
> >> > > --
> >> > > Uwe Schindler
> >> > > H.-H.-Meier-Allee 63, 28213 Bremen
> >> > > http://www.thetaphi.de
> >> > >
> >> > > Cheng wrote:
> >> > >
> >> > > Please shed more light on the difference between the JVM heap size
> >> > > and the memory size used by Lucene.
> >> > >
> >> > > What I am getting at is that no matter how much RAM I give my
> >> > > apps, Lucene can't utilize it. Is that right?
> >> > >
> >> > > What about the ByteBufferDirectory? Can this specific directory
> >> > > utilize the 2GB memory I grant to the app?
> >> > >
> >> > > On Mon, Jun 4, 2012 at 10:58 PM, Jason Rutherglen
> >> > > <jason.rutherglen@gmail.com> wrote:
> >> > >
> >> > > > If you want the index to be stored completely in RAM, there is the
> >> > > > ByteBuffer directory [1]. Though I do not see the point in putting
> >> > > > an index in RAM, it will be cached in RAM regardless in the OS
> >> > > > system IO cache.
> >> > > >
> >> > > > 1. https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/apache/lucene/store/bytebuffer/ByteBufferDirectory.java
> >> > > >
> >> > > > On Mon, Jun 4, 2012 at 10:55 AM, Cheng wrote:
> >> > > > > My indexes are 500MB+. So it seems that RAMDirectory is not
> >> > > > > good for that big a size.
> >> > > > >
> >> > > > > My challenge, on the other hand, is that I need to update the
> >> > > > > indexes very frequently. So, do you think MMapDirectory is the
> >> > > > > solution?
> >> > > > >
> >> > > > > Thanks.
> >> > > > >
> >> > > > > On Mon, Jun 4, 2012 at 10:30 PM, Jack Krupansky
> >> > > > > <jack@basetechnology.com> wrote:
> >> > > > >
> >> > > > >> From the javadoc for RAMDirectory:
> >> > > > >>
> >> > > > >> "Warning: This class is not intended to work with huge indexes.
> >> > > > >> Everything beyond several hundred megabytes will waste resources
> >> > > > >> (GC cycles), because it uses an internal buffer size of 1024
> >> > > > >> bytes, producing millions of byte[1024] arrays. This class is
> >> > > > >> optimized for small memory-resident indexes. It also has bad
> >> > > > >> concurrency on multithreaded environments.
> >> > > > >>
> >> > > > >> It is recommended to materialize large indexes on disk and use
> >> > > > >> MMapDirectory, which is a high-performance directory
> >> > > > >> implementation working directly on the file system cache of the
> >> > > > >> operating system, so copying data to Java heap space is not
> >> > > > >> useful."
> >> > > > >>
> >> > > > >> -- Jack Krupansky
> >> > > > >>
> >> > > > >> -----Original Message----- From: Cheng
> >> > > > >> Sent: Monday, June 04, 2012 10:08 AM
> >> > > > >> To: java-user@lucene.apache.org
> >> > > > >> Subject: RAMDirectory unexpectedly slows
> >> > > > >>
> >> > > > >> Hi,
> >> > > > >>
> >> > > > >> My apps need to read from and write to some big indexes
> >> > > > >> frequently. So I use RAMDirectory instead of FSDirectory, and
> >> > > > >> give JVM about 2GB memory size.
> >> > > > >>
> >> > > > >> I notice that the speed of reading and writing unexpectedly slows
> >> > > > >> as the size of the indexes increases. Since the usage of RAM is
> >> > > > >> less than 20%, I think by default the RAMDirectory doesn't take
> >> > > > >> advantage of the memory I assigned to JVM.
> >> > > > >>
> >> > > > >> What are the steps to improve the reading and writing speed of
> >> > > > >> RAMDirectory?
> >> > > > >>
> >> > > > >> Thanks!
> >> > > > >> Jeff
> >> > > > >>
> >> > > > >> ---------------------------------------------------------------------
> >> > > > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > > > >> For additional commands, e-mail: java-user-help@lucene.apache.org
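
Uwe's point in the quoted thread, that memory used through mmap sits outside the JVM heap, can be illustrated without Lucene. The following is only a minimal sketch using plain java.nio; the class name MmapOffHeapDemo and the helper mapTempFile are hypothetical names made up for this illustration, and the sketch does not reproduce how MMapDirectory is implemented. It writes a small temporary file, maps it read-only, and reports that the resulting buffer is a direct buffer with no backing heap array.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapOffHeapDemo {

    /**
     * Writes {@code size} bytes to a temporary file and maps the file
     * read-only. Hypothetical helper, for illustration only.
     */
    public static MappedByteBuffer mapTempFile(int size) throws IOException {
        Path file = Files.createTempFile("mmap-demo", ".bin");
        file.toFile().deleteOnExit();

        // Fill the file with a simple repeating pattern.
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) {
            data[i] = (byte) (i % 127);
        }
        Files.write(file, data);

        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
        }
    }

    public static void main(String[] args) throws IOException {
        MappedByteBuffer buf = mapTempFile(1 << 20);

        // A mapped buffer is a direct buffer: its content is not stored
        // in a byte[] on the Java heap.
        System.out.println("direct buffer: " + buf.isDirect());
        System.out.println("backed by a heap array: " + buf.hasArray());
        System.out.println("byte at offset 1000: " + buf.get(1000));
    }
}
```

Because the mapped pages are served from the operating system's file cache, raising -Xmx does not give such a mapping more room; as Uwe notes, it leaves less memory for the OS cache instead.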