lucene-dev mailing list archives

From karl wettin
Subject Re: Using Lucene for searching tokens, not storing them.
Date Thu, 20 Apr 2006 05:47:02 GMT

On 20 Apr 2006, at 07:29, karl wettin wrote:

> On 18 Apr 2006, at 22:08, karl wettin wrote:
>> After adding a couple of binary searches in much-needed places
>> (and a couple of new bugs that in a few cases affect the results)
>> I'm now down to 1/8th of the time compared to RAMDirectory. That
>> is really fast if you ask me.
> After fixing the bugs, it's now 4.5 to 5 times the speed. This
> holds at both index and query time. Sorry if I got your hopes up
> too much. There are still things to be done, though. I might not have
> time to do anything with this until next month, so here is the code
> if anyone wants a peek.
> It's not good enough for Jira yet, but if someone wants to fool around
> with it, here it is. The implementation passes a TermEnum ->
> TermDocs -> Fields -> TermVector comparison against the same data
> in a Directory.
> When it comes to features, offsets don't exist, and positions are
> stored in an ugly way and have bugs.
> You might notice that norms are float[] and not byte[]. I
> refactored that to see if it would do any good. Bit shifting
> doesn't take many ticks, so I might just revert that.
> I believe the code is quite self-explanatory.
> InstanciatedIndex ii = ..
> InstanciatedIndexReader();
> ii.addDocument(s).. replace IndexWriter for now.

No attachments allowed, eh?

Ok, I'll pop it in the Jira then.
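
For readers wondering about the float[] versus byte[] norms remark in the quoted message: the trade-off is between keeping each norm as a full 4-byte float and packing it into a single byte with a few shifts. Below is a minimal sketch of such a packing, assuming an 8-bit layout with 3 mantissa bits and 5 exponent bits. The class and method names (NormCodec, encode, decode) are hypothetical and are not taken from the posted code.

```java
/**
 * Illustrative 8-bit float codec (hypothetical class, not the posted code).
 * Assumed layout: 3 mantissa bits, 5 exponent bits, zero exponent point 15.
 */
final class NormCodec {
    // Drop all but the top 3 mantissa bits of an IEEE 754 float.
    private static final int MANTISSA_SHIFT = 24 - 3;
    // Offset between the float exponent range and the 5-bit exponent range.
    private static final int EXPONENT_OFFSET = (63 - 15) << 3;

    /** Packs a float into one byte; precision is lost. */
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> MANTISSA_SHIFT;
        if (small <= EXPONENT_OFFSET) {
            // Zero and negatives map to 0; tiny positives to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= EXPONENT_OFFSET + 0x100) {
            // Too large to represent: saturate at the largest code (0xFF).
            return (byte) -1;
        }
        return (byte) (small - EXPONENT_OFFSET);
    }

    /** Unpacks a byte produced by encode back into a float. */
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << MANTISSA_SHIFT;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }
}
```

With this layout, NormCodec.decode(NormCodec.encode(1.0f)) returns 1.0f exactly, while 0.3f comes back as 0.25f. Decoding costs one shift and one add, which is consistent with the observation that bit shifting is cheap; a float[] avoids that work at four times the memory per norm.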
