Message-ID: <29071184.1153525275708.JavaMail.jira@brutus>
Date: Fri, 21 Jul 2006 16:41:15 -0700 (PDT)
From: "Karl Wettin (JIRA)"
To: java-dev@lucene.apache.org
Reply-To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-550) InstanciatedIndex - faster but memory consuming index
In-Reply-To: <33188999.1145512002000.JavaMail.jira@brutus>

    [ http://issues.apache.org/jira/browse/LUCENE-550?page=comments#action_12422795 ]

Karl Wettin commented on LUCENE-550:
------------------------------------

In order to find the norm error I ported all test cases. I'm sorry to report that 70 of them fail. So if anyone is using this code, don't. :-)

Hopefully most of the failures share the same cause. I'll be working on the code this weekend, and perhaps a few days next week if needed.

> InstanciatedIndex - faster but memory consuming index
> -----------------------------------------------------
>
>                 Key: LUCENE-550
>                 URL: http://issues.apache.org/jira/browse/LUCENE-550
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Store
>    Affects Versions: 1.9
>            Reporter: Karl Wettin
>         Attachments: class_diagram.png, class_diagram.png, instanciated_20060527.tar, InstanciatedIndexTermEnum.java, lucene.1.9-karl1.jpg
>
>
> After fixing the bugs, it's now 4.5 to 5 times the speed. This is true at both index and query time. Sorry if I got your hopes up too much. There are still things to be done though. I might not have time to do anything with this until next month, so here is the code if anyone wants a peek.
> Not good enough for Jira yet, but if someone wants to fool around with it, here it is. The implementation passes a TermEnum -> TermDocs -> Fields -> TermVector comparison against the same data in a Directory.
> When it comes to features, offsets don't exist, and positions are stored in an ugly way and have bugs.
> You might notice that norms are float[] and not byte[]. I refactored it that way to see if it would do any good. Bit shifting doesn't take many ticks, so I might just revert that.
> I believe the code is quite self-explanatory.
> InstanciatedIndex ii = ..
> ii.new InstanciatedIndexReader();
> ii.addDocument(s).. replaces IndexWriter for now.

--
This message is automatically generated by JIRA.