From: karl wettin
Subject: Re: Using Lucene for searching tokens, not storing them.
Date: Thu, 20 Apr 2006 07:47:02 +0200
To: java-dev@lucene.apache.org

On 20 Apr 2006, at 07:29, karl wettin wrote:

>
> On 18 Apr 2006, at 22:08, karl wettin wrote:
>
>> After adding a couple of binary searches in well-needed places
>> (and a couple of new bugs that in a few cases affect the results),
>> I'm now down to 1/8th of the time compared to RAMDirectory. That
>> is really fast if you ask me.
>
> After fixing the bugs, it's now 4.5 to 5 times the speed. This is
> true at both index and query time. Sorry if I got your hopes up
> too much. There are still things to be done, though. I might not
> have time to do anything with this until next month, so here is
> the code if anyone wants a peek.
>
> It's not good enough for Jira yet, but if someone wants to fool
> around with it, here it is. The implementation passes a TermEnum ->
> TermDocs -> Fields -> TermVector comparison against the same data
> in a Directory.
>
> When it comes to features, offsets don't exist, and positions are
> stored in an ugly way and have bugs.
>
> You might notice that norms are float[] and not byte[]. I
> refactored that to see if it would do any good. Bit shifting
> doesn't take many ticks, so I might just revert that.
>
> I believe the code is quite self-explanatory.
>
> InstanciatedIndex ii = ..
> ii.new InstanciatedIndexReader();
> ii.addDocument(s).. replaces IndexWriter for now.

No attachments allowed, eh? OK, I'll pop it in the Jira then.