lucene-dev mailing list archives

From "Karl Wettin (JIRA)" <>
Subject [jira] Commented: (LUCENE-550) InstanciatedIndex - faster but memory consuming index
Date Thu, 11 May 2006 18:46:12 GMT

Karl Wettin commented on LUCENE-550:

Doug Cutting commented on LUCENE-550:

> This looks very promising.  Unfortunately the code you provide makes many incompatible
> changes (e.g., turning Term into an interface that has far fewer methods) removes lots
> useful javadoc, etc.  So please don't expect it to be committed soon!

I agree, there is lots of work to be done on it. It was easier for me to think clearly when
everything was separated. Basically, only a few changes to the API are needed:

1. Neither Document nor Term may be final.
2. Some other minor thing that I forgot about.
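
To illustrate why the final modifier matters here, the following is a minimal sketch and not
code from the patch. Doc and IndexedDoc are hypothetical stand-ins (they are not Lucene's
Document or Term); the point is only that an in-memory index can attach its own state to a
class by subclassing it, which Java forbids when the base class is declared final.

```java
import java.util.ArrayList;
import java.util.List;

public class NonFinalSketch {
    // Hypothetical stand-in for a document class; NOT Lucene's Document.
    // It is deliberately not declared final, so it can be subclassed.
    static class Doc {
        private final List<String> fields = new ArrayList<>();

        void add(String field) {
            fields.add(field);
        }

        List<String> getFields() {
            return fields;
        }
    }

    // Hypothetical subclass that an in-memory index might use to carry
    // extra per-document state, here just a document number.
    // If Doc were final, this declaration would not compile.
    static class IndexedDoc extends Doc {
        private final int docNumber;

        IndexedDoc(int docNumber) {
            this.docNumber = docNumber;
        }

        int getDocNumber() {
            return docNumber;
        }
    }

    public static void main(String[] args) {
        IndexedDoc doc = new IndexedDoc(7);
        doc.add("title");
        // The subclass is still usable wherever the base type is expected.
        Doc asBase = doc;
        System.out.println(asBase.getFields() + " docNumber=" + doc.getDocNumber());
    }
}
```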

It can all be fixed, but it is not something I am prioritizing right now. If you feel it would
be a nice thing for 2.0, tell me what changes you are OK with and give me at least two weeks'
notice; I /might/ find time to back-factor the code.

> InstanciatedIndex - faster but memory consuming index
> -----------------------------------------------------
>          Key: LUCENE-550
>          URL:
>      Project: Lucene - Java
>         Type: New Feature

>   Components: Store
>     Versions: 1.9
>     Reporter: Karl Wettin
>  Attachments:,,, class_diagram.png, class_diagram.png,
src-1.9karl1_20060611.tar.gz, src.tar.gz, src_20060509.tar.gz
> After fixing the bugs, it's now 4.5 -> 5 times the speed. This is true at both index
> and query time. Sorry if I got your hopes up too much. There are still things to be
> done though. Might not have time to do anything with this until next month, so here is the
> code if anyone wants a peek.
> Not good enough for Jira yet, but if someone wants to fool around with it, here it is.
> The implementation passes a TermEnum -> TermDocs -> Fields -> TermVector comparison
> against the same data in a Directory.
> When it comes to features, offsets don't exist and positions are stored ugly and has
> You might notice that norms are float[] and not byte[]. I refactored it that way
> to see if it would do any good. Bit shifting doesn't take many ticks, so I might just revert
> I believe the code is quite self-explanatory.
> InstanciatedIndex ii = ..
> InstanciatedIndexReader();
> ii.addDocument(s).. replace IndexWriter for now.
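
As background for the float[] versus byte[] norms remark in the description above, here is a
minimal sketch of packing a float into a single byte with a 3-bit mantissa and a 5-bit
exponent, decoded with a couple of shifts. NormCodecSketch and its method names are
hypothetical, and whether this matches the encoding in the 1.9 code base bit for bit is an
assumption; it is meant only to show how little work the byte-to-float decode is.

```java
public class NormCodecSketch {
    // Offset of the smallest representable exponent, shifted past the
    // 3 mantissa bits that are kept.
    private static final int ZERO_POINT = (63 - 15) << 3;

    // Pack a float into one byte: 5 exponent bits, 3 mantissa bits.
    // Precision is lost; values are rounded down to the nearest step.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);           // keep exponent + top 3 mantissa bits
        if (small <= ZERO_POINT) {
            // Zero and negatives map to 0; tiny positives map to the smallest step.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO_POINT + 0x100) {
            return (byte) -1;                   // overflow: largest representable value
        }
        return (byte) (small - ZERO_POINT);
    }

    // Unpack with one shift and one add, then reinterpret the bits as a float.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] samples = {0.0f, 0.3f, 0.5f, 1.0f};
        for (float f : samples) {
            byte b = encode(f);
            System.out.println(f + " -> byte " + (b & 0xff) + " -> " + decode(b));
        }
    }
}
```

With this scheme 1.0f and 0.5f survive the round trip exactly, while 0.3f comes back as
0.25f. That precision loss is the trade-off against a float[] that costs four times the memory.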
