lucene-dev mailing list archives

From Michael McCandless
Subject Re: Future projects
Date Fri, 03 Apr 2009 10:35:59 GMT
On Thu, Apr 2, 2009 at 5:56 PM, Jason Rutherglen wrote:
>> I think I need to understand better why delete by Query isn't
>> viable in your situation...
> The delete by query is a separate problem which I haven't fully
> explored yet.

Oh, I had thought we were tugging on this thread in order to explore
delete-by-docID in the writer.  OK.

> Tracking the segment genealogy is really an
> interim step for merging field caches before column stride
> fields gets implemented.

I see -- meaning in Bobo you'd like to manage your own memory resident
field caches, and merge them whenever IW has merged a segment?  Seems
like you don't need genealogy for that.
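
A rough sketch of what merging memory-resident caches could look like
when a merge completes. This is a toy model, not Lucene or Bobo code:
FieldCacheMergeSketch and SegmentCache are hypothetical names, and it
assumes the caller already knows which per-segment caches fed the merge,
in merge order. It relies on a merge appending each source segment's
surviving documents in order and dropping deleted ones.

```java
import java.util.BitSet;
import java.util.List;

// Toy model, not Lucene code: a "segment" is a value array plus a deleted-docs bitset.
final class FieldCacheMergeSketch {

    // Hypothetical holder for one segment's cached values.
    static final class SegmentCache {
        final int[] values;   // one value per docID within the segment
        final BitSet deleted; // bit set => that document was deleted

        SegmentCache(int[] values, BitSet deleted) {
            this.values = values;
            this.deleted = deleted;
        }
    }

    // Concatenates the source caches in segment order, skipping deleted docs,
    // so the result lines up with the renumbered docIDs of the merged segment.
    static int[] merge(List<SegmentCache> sources) {
        // First pass: count surviving documents to size the merged array.
        int live = 0;
        for (SegmentCache s : sources) {
            for (int doc = 0; doc < s.values.length; doc++) {
                if (!s.deleted.get(doc)) {
                    live++;
                }
            }
        }
        // Second pass: copy surviving values in order.
        int[] merged = new int[live];
        int next = 0;
        for (SegmentCache s : sources) {
            for (int doc = 0; doc < s.values.length; doc++) {
                if (!s.deleted.get(doc)) {
                    merged[next++] = s.values[doc];
                }
            }
        }
        return merged;
    }
}
```

The copy is linear in the number of documents merged, which is why it
can be cheaper than rebuilding the merged segment's cache from scratch.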

> Actually CSF cannot be used with Bobo's
> field caches anyways which means we'd need a way to find out
> about the segment parents.

CSF isn't really designed yet.  How come it can't be used with Bobo's
field caches?  We can try to accommodate Bobo's field cache needs when
designing CSF.

>> Does it operate at the segment level? Seems like that'd give
>> you good enough realtime performance (though merging in RAM will
>> definitely be faster).
> We need to see how Bobo integrates with LUCENE-1483.

Lucene's internal field cache usage is now entirely at the segment
level (i.e., Lucene core should never request the full field cache array
at the MultiSegmentReader level).  I think Bobo would have to do the
same, if it handles near-realtime updates, to get adequate performance.
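
To illustrate why per-segment caching matters for near-realtime reopen,
here is a minimal sketch. It is a toy model and not Lucene's FieldCache:
PerSegmentCacheSketch, refresh and load are hypothetical names, segments
are identified by a plain name string, and the loader just fills in
placeholder values where a real one would uninvert or read a column.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, not Lucene's FieldCache: arrays are cached per segment name.
final class PerSegmentCacheSketch {
    private final Map<String, int[]> cache = new HashMap<>();
    int loads = 0; // how many times a segment array had to be built

    // Hypothetical loader standing in for uninversion or a column read.
    private int[] load(String segment, int maxDoc) {
        loads++;
        int[] values = new int[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            values[doc] = doc; // placeholder values
        }
        return values;
    }

    // Brings the cache in line with the segments of a newly opened reader:
    // drops arrays for segments that are gone, loads arrays only for new ones.
    void refresh(Map<String, Integer> segmentMaxDocs) {
        cache.keySet().retainAll(segmentMaxDocs.keySet());
        for (Map.Entry<String, Integer> e : segmentMaxDocs.entrySet()) {
            if (!cache.containsKey(e.getKey())) {
                cache.put(e.getKey(), load(e.getKey(), e.getValue()));
            }
        }
    }

    // Returns the cached array for a segment, or null if it is not cached.
    int[] get(String segment) {
        return cache.get(segment);
    }
}
```

With this shape, a reopen that adds one small segment builds one small
array, whereas a cache keyed on the top-level reader would rebuild the
array for the whole index each time.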

Though... since we have LUCENE-831 (rework API Lucene exposes for
accessing arrays-of-atomic-types-per-segment) and LUCENE-1231 (CSF = a
more efficient impl (than uninversion) of the API we expose in
LUCENE-831) on deck, we should try to understand Bobo's needs.

E.g., why did Bobo make its own field cache impl?  Just because
uninversion is too slow?

> It seems like we've been talking about CSF for 2 years and there
> isn't a patch for it? If I had more time I'd take a look. What
> is the status of it?

I think Michael is looking into it?  I'd really like to get it into
2.9.  We should do it in conjunction with 831 since they are so tied.

> I'll write a patch that implements a callback for the segment
> merging such that the user can decide what information they want
> to record about the merged SRs (I'm pretty sure there isn't a
> way to do this with MergePolicy?)

Actually I think you can do this with a simple MergeScheduler wrapper or
by subclassing ConcurrentMergeScheduler (CMS).  I'll put a comment on
the issue.
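
The wrapper idea, sketched as a toy model. None of these types are
Lucene classes: Scheduler, Merge, MergeListener and NotifyingScheduler
are hypothetical stand-ins, and a real version would have to hook into
whatever the actual merge scheduler exposes. The point is only the
shape: delegate the merge, then tell a listener which segments went in
and which segment came out.

```java
import java.util.List;

// Toy model only: hypothetical stand-ins, not Lucene's MergeScheduler API.
final class MergeCallbackSketch {

    // Hypothetical description of one merge: source segment names and the result.
    static final class Merge {
        final List<String> sources;
        final String target;

        Merge(List<String> sources, String target) {
            this.sources = sources;
            this.target = target;
        }
    }

    // Hypothetical scheduler: something that carries out a merge.
    interface Scheduler {
        void run(Merge merge);
    }

    // Hypothetical callback the user implements to record merge information.
    interface MergeListener {
        void merged(Merge merge);
    }

    // Wraps any scheduler and reports each merge after the delegate has run it.
    static final class NotifyingScheduler implements Scheduler {
        private final Scheduler delegate;
        private final MergeListener listener;

        NotifyingScheduler(Scheduler delegate, MergeListener listener) {
            this.delegate = delegate;
            this.listener = listener;
        }

        @Override
        public void run(Merge merge) {
            delegate.run(merge);    // do the actual merge first
            listener.merged(merge); // then let the user record what they need
        }
    }
}
```

A listener could use the source and target names to merge its own
per-segment caches, without the writer having to track any history.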

