lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation
Date Fri, 10 Apr 2009 14:41:15 GMT


Michael McCandless commented on LUCENE-831:

bq. If we are going to allow random access, I like the idea of sticking with the arrays. They
are faster than hiding behind a method, and it allows easier movement from the old API.

I agree.

bq. It would be nice if we can still deprecate all of that by backing it with the new impl
(as done with the old patch).

That seems fine?

bq. The current API (from this patch) still looks fairly good to me - a given cachekey gets
your data, and knows how to construct it. You get data something like: return (byte[]) reader.getCachedData(new
ByteCacheKey(field, parser)). It could be improved, but it seems a good start to me.
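
To make the quoted idea concrete, here is a minimal sketch of a cache key that both identifies the cached data and knows how to construct it. Only getCachedData and ByteCacheKey are named in the quote above; SegmentLikeReader, CacheKey, ByteParser, terms() and buildCount are hypothetical stand-ins for illustration, not the API from the patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CacheKeySketch {

    /** Hypothetical parser: converts one indexed term to a byte. */
    public interface ByteParser {
        byte parseByte(String term);
    }

    /** Hypothetical key: identifies a cache entry and knows how to build it. */
    public interface CacheKey {
        Object buildData(SegmentLikeReader reader);
    }

    /** Stand-in for a single-segment reader that owns its own cache. */
    public static class SegmentLikeReader {
        private final Map<String, String[]> termPerDoc; // field -> term of each doc
        private final Map<CacheKey, Object> cache = new HashMap<>();
        public int buildCount = 0; // how many times a key had to build its data

        public SegmentLikeReader(Map<String, String[]> termPerDoc) {
            this.termPerDoc = termPerDoc;
        }

        public String[] terms(String field) {
            return termPerDoc.get(field);
        }

        /** Returns the cached data for the key, building it on first use. */
        public synchronized Object getCachedData(CacheKey key) {
            Object data = cache.get(key);
            if (data == null) {
                data = key.buildData(this);
                cache.put(key, data);
                buildCount++;
            }
            return data;
        }
    }

    /** Key for a byte[] with one value per document of the given field. */
    public static final class ByteCacheKey implements CacheKey {
        private final String field;
        private final ByteParser parser;

        public ByteCacheKey(String field, ByteParser parser) {
            this.field = field;
            this.parser = parser;
        }

        @Override
        public Object buildData(SegmentLikeReader reader) {
            // Assumes the field exists in this toy reader.
            String[] terms = reader.terms(field);
            byte[] values = new byte[terms.length];
            for (int doc = 0; doc < terms.length; doc++) {
                values[doc] = parser.parseByte(terms[doc]);
            }
            return values;
        }

        // Two keys are equal when they name the same field and parser,
        // so a second lookup hits the cache instead of rebuilding.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ByteCacheKey)) return false;
            ByteCacheKey other = (ByteCacheKey) o;
            return field.equals(other.field) && parser.equals(other.parser);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, parser);
        }
    }
}
```

With this shape the call from the quote reads (byte[]) reader.getCachedData(new ByteCacheKey(field, parser)), and the cache lives on the reader rather than in a global static map.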


bq. The immediate problem I see is how to handle multireader vs reader. Not being able to
treat them the same is a real pain. In the segment case, you just want an array back, in the
multi-segment perhaps an array of arrays? Or unsupported? I haven't thought of anything nice.

I would lean towards throwing UOE (UnsupportedOperationException), and
suggesting that you call getSequentialSubReaders instead.

Eg with the new getUniqueTermCount() we do that.

bq. We have always been able to customize a lot of behavior with our custom sort types - I
guess the real issue is making the built-in sort types customizable. So I guess we need some way
to say, use this "cachekey" for this built-in type?

I don't quite follow that last sentence.

We'll have a lot of customizability here, i.e., if you want to change how
String is parsed to int, if you want to fully override how uninversion
works, etc.  At first the core will only support uninversion as a
source of values, but once CSF (column-stride fields) is online that
should be an alternate pluggable source, presumably plugging in the
same way that customization would allow you to override uninversion.
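
As a rough illustration of what "pluggable source" could mean here, the sketch below puts uninversion and a stored per-document column behind one interface. Every name in it (IntValueSource, IntParser, UninvertingSource, StoredColumnSource) is hypothetical, and the in-memory maps only stand in for the index; it is not the design from the patch.

```java
import java.util.Arrays;
import java.util.Map;

public class ValueSourceSketch {

    /** Hypothetical parser: controls how a term String becomes an int. */
    public interface IntParser {
        int parseInt(String term);
    }

    /** Hypothetical pluggable source of one int per document. */
    public interface IntValueSource {
        int[] load(String field, int maxDoc);
    }

    /** Uninversion: walk term -> docIDs postings and fill a per-doc array. */
    public static class UninvertingSource implements IntValueSource {
        private final Map<String, Map<String, int[]>> postings; // field -> term -> docIDs
        private final IntParser parser;

        public UninvertingSource(Map<String, Map<String, int[]>> postings, IntParser parser) {
            this.postings = postings;
            this.parser = parser;
        }

        @Override
        public int[] load(String field, int maxDoc) {
            int[] values = new int[maxDoc]; // docs without a term keep 0
            Map<String, int[]> terms = postings.get(field);
            if (terms == null) {
                return values;
            }
            for (Map.Entry<String, int[]> entry : terms.entrySet()) {
                int value = parser.parseInt(entry.getKey());
                for (int doc : entry.getValue()) {
                    values[doc] = value;
                }
            }
            return values;
        }
    }

    /** Stand-in for a stored column: values are already laid out per document. */
    public static class StoredColumnSource implements IntValueSource {
        private final Map<String, int[]> columns; // field -> value per doc

        public StoredColumnSource(Map<String, int[]> columns) {
            this.columns = columns;
        }

        @Override
        public int[] load(String field, int maxDoc) {
            int[] column = columns.get(field);
            return column == null ? new int[maxDoc] : Arrays.copyOf(column, maxDoc);
        }
    }
}
```

Swapping the parser changes how terms become ints, and swapping the whole source replaces uninversion, which are the two levels of override described above.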

bq. When we load the new caches in FieldComparator, can we count on those being segmentreaders?
We can Lucene wise, but not API wise right? Does that matter? I suppose it's really tied in
with the multireader vs reader API.

Once getSequentialSubReaders() has been called (recursively, if needed),
the resulting "atomic" readers should be able to provide values.  I guess
that's the contract we require of a given IndexReader impl?
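
A small sketch of that contract, assuming a toy Reader hierarchy rather than the real IndexReader: composite readers throw UnsupportedOperationException when asked for values, and callers recurse through getSequentialSubReaders() until they reach atomic readers. Returning null from getSequentialSubReaders() to mark an atomic reader is an assumption of this sketch, and Reader, SegmentLikeReader, CompositeReader, getBytes and gatherAtomicReaders are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class AtomicReaderSketch {

    /** Toy reader; not the real IndexReader. */
    public abstract static class Reader {
        /** Returns the sub-readers, or null if this reader is atomic (assumption). */
        public Reader[] getSequentialSubReaders() {
            return null;
        }

        /** Returns one value per document; only atomic readers must support this. */
        public abstract byte[] getBytes(String field);
    }

    /** Atomic reader: can hand back a flat array. */
    public static class SegmentLikeReader extends Reader {
        private final byte[] values;

        public SegmentLikeReader(byte... values) {
            this.values = values;
        }

        @Override
        public byte[] getBytes(String field) {
            return values;
        }
    }

    /** Composite reader: refuses random access and points callers at its subs. */
    public static class CompositeReader extends Reader {
        private final Reader[] subs;

        public CompositeReader(Reader... subs) {
            this.subs = subs;
        }

        @Override
        public Reader[] getSequentialSubReaders() {
            return subs;
        }

        @Override
        public byte[] getBytes(String field) {
            throw new UnsupportedOperationException(
                "call getSequentialSubReaders() and ask each atomic reader instead");
        }
    }

    /** Recursively collects the atomic readers under the given reader, in order. */
    public static void gatherAtomicReaders(Reader reader, List<Reader> out) {
        Reader[] subs = reader.getSequentialSubReaders();
        if (subs == null) {
            out.add(reader);
        } else {
            for (Reader sub : subs) {
                gatherAtomicReaders(sub, out);
            }
        }
    }

    /** Convenience wrapper that returns a fresh list. */
    public static List<Reader> atomicReaders(Reader top) {
        List<Reader> out = new ArrayList<>();
        gatherAtomicReaders(top, out);
        return out;
    }
}
```

A comparator would then call atomicReaders(top) once and load its per-segment array from each element, never from the composite itself.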

> Complete overhaul of FieldCache API/Implementation
> --------------------------------------------------
>                 Key: LUCENE-831
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>             Fix For: 3.0
>         Attachments:, fieldcache-overhaul.032208.diff, fieldcache-overhaul.diff,
fieldcache-overhaul.diff, LUCENE-831.03.28.2008.diff, LUCENE-831.03.30.2008.diff, LUCENE-831.03.31.2008.diff,
LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch,
> Motivation:
> 1) Completely overhaul the API/implementation of "FieldCache" type things...
>     a) eliminate global static map keyed on IndexReader (thus
>         eliminating synch block between completely independent IndexReaders)
>     b) allow more customization of cache management (ie: use 
>         expiration/replacement strategies, disk backed caches, etc)
>     c) allow people to define custom cache data logic (ie: custom
>         parsers, complex datatypes, etc... anything tied to a reader)
>     d) allow people to inspect what's in a cache (list of CacheKeys) for
>         an IndexReader so a new IndexReader can be likewise warmed. 
>     e) Lend support for smarter cache management if/when
>         IndexReader.reopen is added (merging of cached data from subReaders).
> 2) Provide backwards compatibility to support existing FieldCache API with
>     the new implementation, so there is no redundant caching as client code
>     migrates to new API.

This message is automatically generated by JIRA.