From: "Karl Wettin (JIRA)"
To: java-dev@lucene.apache.org
Date: Sat, 17 Mar 2007 13:12:09 -0700 (PDT)
Subject: [jira] Updated: (LUCENE-550) InstantiatedIndex - faster but memory consuming index
Message-ID: <32309322.1174162329972.JavaMail.jira@brutus>
In-Reply-To: <33188999.1145512002000.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/LUCENE-550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wettin updated LUCENE-550:
-------------------------------

    Attachment: HitCollectionBench.jpg

A graph showing performance of hit
collection using InstantiatedIndex, RAMDirectory and FSDirectory.

In essence, there is no great win in pure search time when there are more than 7000 documents. However, retrieving documents is still not associated with any cost whatsoever, so in a 250000-document index that uses Lucene for persistence of fields, I still see a boost of 6-10x or so compared to RAMDirectory.

Queries per second by number of documents in corpus. Each index was created with 10000 documents.

InstantiatedIndex  = org.apache.lucene.store.instantiated.InstantiatedIndex@628704
RAMDirectoryIndex  = org.apache.lucene.index.facade.RAMDirectoryIndex@af993e
FSDirectoryIndex   = org.apache.lucene.index.facade.FSDirectoryIndex@4112c0

documents  InstantiatedIndex  RAMDirectoryIndex  FSDirectoryIndex
      250           37530.00            4845.00           3448.28
      500           29610.00            3986.01           2422.50
      750           22612.50            4330.67           2677.50
     1000           19267.50            4682.82           2607.39
     1250           16027.50            4148.78           2241.92
     1500           14737.50            4847.65           2486.27
     1750           13230.00            4535.23           2472.53
     2000           12322.50            4192.50           1733.52
     2250           11482.50            4203.30           2325.00
     2500           10125.00            3695.65           2194.21
     2750            9802.50            3742.50           1969.55
     3000            8508.25            3485.76           2125.75
     3250            8469.80            3470.76           2009.00
     3500            7788.61            3525.00           1473.08
     3750            5207.29            2877.61           1858.14
     4000            5484.52            3221.78           1925.57
     4250            4912.50            2983.51           1671.66
     4500            4420.58            2982.02           1786.25
     4750            4006.49            2724.55           1694.15
     5000            4357.50            3092.86           1217.63
     5250            3886.67            2646.18           1595.11
     5500            3573.93            2940.00           1745.75
     5750            3236.76            2709.58           1526.18
     6000            3602.10            2423.30           1431.78
     6250            3420.00            2602.50           1524.66
     6500            3075.00            2305.39           1648.35
     6750            2805.00            2462.57           1544.23
     7000            2680.98            1815.00           1428.22
     7250            2908.55            2431.42           1487.29
     7500            2769.46            2171.74           1494.02
     7750            2644.86            2297.90           1106.13
     8000            2496.25            2134.30           1455.00
     8250            2377.50            2308.85           1284.86
     8500            2578.71            2038.98           1182.63
     8750            2390.11            2231.65           1292.33
     9000            2160.00            2097.90           1399.70
     9250            2037.96            2041.38           1000.00
     9500            1872.19            1819.77           1291.04
     9750            2041.38            2102.24           1359.56
    10000            1959.12            1876.87           1194.62

> InstantiatedIndex - faster but memory consuming index
> -----------------------------------------------------
>
>                 Key: LUCENE-550
>                 URL: https://issues.apache.org/jira/browse/LUCENE-550
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Store
>    Affects Versions: 2.0.0
>            Reporter: Karl Wettin
>         Assigned To: Karl Wettin
>         Attachments: HitCollectionBench.jpg, lucene-550.jpg, test-reports.zip, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2, trunk.diff.bz2
>
>
> A non-file-centric, all-in-memory index. Consumes some 2x the memory of a RAMDirectory (in a term-saturated index) but is between 3x and 60x faster depending on application and how one counts. Average query is about 8x faster. IndexWriter and IndexModifier have been realized in InterfaceIndexWriter and InterfaceIndexModifier.
> InstantiatedIndex is wrapped in a new top-layer index facade (class Index) that comes with factory methods for writers, readers and searchers for uniform index handling. There are decorators with notification handling that can be used for automatically synchronizing searchers on updates, etc.
> Index also comes with FS/RAMDirectory implementations.
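
The benchmark figures above suggest that the search-time advantage of InstantiatedIndex shrinks as the corpus grows. A minimal Python sketch of that arithmetic follows, using six rows copied from the benchmark output. The names ROWS and speedup are illustrative only and are not part of the patch.

```python
# Queries per second at selected corpus sizes, copied from the benchmark
# output above: (documents, InstantiatedIndex, RAMDirectoryIndex, FSDirectoryIndex).
ROWS = [
    (250, 37530.00, 4845.00, 3448.28),
    (1000, 19267.50, 4682.82, 2607.39),
    (2500, 10125.00, 3695.65, 2194.21),
    (5000, 4357.50, 3092.86, 1217.63),
    (7500, 2769.46, 2171.74, 1494.02),
    (10000, 1959.12, 1876.87, 1194.62),
]


def speedup(rows):
    """Return (documents, vs_ram, vs_fs): InstantiatedIndex throughput
    divided by the RAMDirectoryIndex and FSDirectoryIndex throughput."""
    return [(docs, inst / ram, inst / fs) for docs, inst, ram, fs in rows]


if __name__ == "__main__":
    for docs, vs_ram, vs_fs in speedup(ROWS):
        print(f"{docs:>6} docs: {vs_ram:5.2f}x RAMDirectory, {vs_fs:5.2f}x FSDirectory")
```

On these rows the ratio against RAMDirectory falls from well over 7x at 250 documents to close to 1x at 10000, which is consistent with the remark that pure search time gains little beyond roughly 7000 documents. The 6-10x boost reported for document retrieval is a separate measurement and is not covered by this sketch.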