lucene-dev mailing list archives

From "Mark Miller (JIRA)" <>
Subject [jira] Updated: (LUCENE-831) Complete overhaul of FieldCache API/Implementation
Date Fri, 28 Mar 2008 23:14:24 GMT


Mark Miller updated LUCENE-831:

    Attachment: LUCENE-831.03.28.2008.diff

Here is a quick proof-of-concept patch that uses a method call rather than arrays. The speed
gain pertains to reopen.

In my quick test of 'open 500000 tiny docs index, repeat(3): add couple docs/sort search'
the total time taken was:

Orig FieldCache impl: 27 seconds
New impl with arrays: 12 seconds
New impl with method call: 3 seconds

It's kind of a worst-case scenario, but much faster is much faster <g>. The bench does
not push through to the point where method 3 would have to reload all of the segments, so that
would affect it some... but method 1 is reloading all of the segments every single time...

This approach keeps the original approach for those that want to use the arrays. In that case
everything still merges except for the StringIndex, so String sorting is slow. Lucene core
is rigged to use the new method call though, so String sort is as sped up as the other field
types when not using the arrays.
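
To make the arrays-versus-method-call distinction concrete, here is a minimal Java sketch. It is
not the code in the attached patch and not Lucene's API: the class SegmentedIntValues and its
methods get, toMergedArray and reopenWith are hypothetical names invented for illustration. The
assumption is that each segment keeps its own cached array, so a reopen only needs to load the
new segment, while array-style callers pay for a full merge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only: not Lucene's FieldCache API, not the attached patch. */
final class SegmentedIntValues {
    private final int[][] segments; // one cached array per segment
    private final int[] starts;     // starts[i] = first global doc id of segment i

    SegmentedIntValues(List<int[]> perSegment) {
        List<int[]> nonEmpty = new ArrayList<>();
        for (int[] seg : perSegment) {
            if (seg.length > 0) nonEmpty.add(seg); // empty segments own no doc ids
        }
        segments = nonEmpty.toArray(new int[0][]);
        starts = new int[segments.length + 1];
        for (int i = 0; i < segments.length; i++) {
            starts[i + 1] = starts[i] + segments[i].length;
        }
    }

    /** Total number of documents across all segments. */
    int size() {
        return starts[segments.length];
    }

    /** Method-call access: locate the owning segment, then index into its array. */
    int get(int doc) {
        if (doc < 0 || doc >= size()) {
            throw new IndexOutOfBoundsException("doc " + doc);
        }
        int idx = Arrays.binarySearch(starts, 0, segments.length, doc);
        if (idx < 0) {
            idx = -idx - 2; // (insertion point - 1) is the segment containing doc
        }
        return segments[idx][doc - starts[idx]];
    }

    /** Array access: copies every segment into one array, O(total docs) per call. */
    int[] toMergedArray() {
        int[] merged = new int[size()];
        for (int i = 0; i < segments.length; i++) {
            System.arraycopy(segments[i], 0, merged, starts[i], segments[i].length);
        }
        return merged;
    }

    /** Simulated reopen: existing segment arrays are shared, only the new one is added. */
    SegmentedIntValues reopenWith(int[] newSegment) {
        List<int[]> all = new ArrayList<>(Arrays.asList(segments));
        all.add(newSegment);
        return new SegmentedIntValues(all);
    }

    public static void main(String[] args) {
        List<int[]> segs = new ArrayList<>();
        segs.add(new int[] {10, 20, 30});
        segs.add(new int[] {40});
        SegmentedIntValues reopened = new SegmentedIntValues(segs).reopenWith(new int[] {50, 60});
        System.out.println(reopened.get(4));
        System.out.println(Arrays.toString(reopened.toMergedArray()));
    }
}
```

In this sketch the method-call path never copies cached data on reopen, whereas toMergedArray
has to rebuild the full array each time, which is the kind of cost difference the timings above
suggest.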

Not sure everything is completely on the level yet, but all core tests pass (core sort tests
can miss a lot).

I lied about changing all of core to use the new API... I haven't changed the function package.

- Mark

> Complete overhaul of FieldCache API/Implementation
> --------------------------------------------------
>                 Key: LUCENE-831
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>            Assignee: Michael Busch
>             Fix For: 2.4
>         Attachments: fieldcache-overhaul.032208.diff, fieldcache-overhaul.diff, fieldcache-overhaul.diff,
> Motivation:
> 1) Completely overhaul the API/implementation of "FieldCache" type things...
>     a) eliminate global static map keyed on IndexReader (thus
>         eliminating synch block between completely independent IndexReaders)
>     b) allow more customization of cache management (ie: use 
>         expiration/replacement strategies, disk backed caches, etc)
>     c) allow people to define custom cache data logic (ie: custom
>         parsers, complex datatypes, etc... anything tied to a reader)
>     d) allow people to inspect what's in a cache (list of CacheKeys) for
>         an IndexReader so a new IndexReader can be likewise warmed. 
>     e) Lend support for smarter cache management if/when
>         IndexReader.reopen is added (merging of cached data from subReaders).
> 2) Provide backwards compatibility to support existing FieldCache API with
>     the new implementation, so there is no redundant caching as client code
>     migrates to new API.
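
As a rough illustration of motivation points 1a and 1d, the following Java sketch shows a cache
owned by each reader instead of a global static map, with a way to list its keys so a new reader
can be warmed. Everything here is an assumption made for illustration: ReaderLocalCache, get,
keys and warmFrom are hypothetical names, cache keys are simplified to plain strings, and none
of it is taken from the issue's patches or from Lucene.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical illustration only: one cache instance per reader, no shared static state. */
final class ReaderLocalCache {
    // Cache keys are plain strings here for brevity; a real design would likely use richer keys.
    private final ConcurrentHashMap<String, Object> entries = new ConcurrentHashMap<>();

    /** Returns the cached value for the key, loading it at most once per cache instance. */
    Object get(String cacheKey, Function<String, Object> loader) {
        return entries.computeIfAbsent(cacheKey, loader);
    }

    /** Snapshot of the keys currently cached, so callers can inspect what is loaded. */
    Set<String> keys() {
        return new HashSet<>(entries.keySet());
    }

    /** Warms this cache with the same keys that another reader's cache already holds. */
    void warmFrom(ReaderLocalCache other, Function<String, Object> loader) {
        for (String key : other.keys()) {
            get(key, loader);
        }
    }

    public static void main(String[] args) {
        Function<String, Object> loader = key -> "values for " + key;
        ReaderLocalCache oldReaderCache = new ReaderLocalCache();
        oldReaderCache.get("price", loader);
        ReaderLocalCache newReaderCache = new ReaderLocalCache();
        newReaderCache.warmFrom(oldReaderCache, loader);
        System.out.println(newReaderCache.keys());
    }
}
```

Because each instance has its own map, two independent readers never contend on a shared lock
in this sketch, and the cache can be dropped together with its reader.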

