lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Realtime Search for Social Networks Collaboration
Date Tue, 09 Sep 2008 09:28:09 GMT

Yonik Seeley wrote:

> On Mon, Sep 8, 2008 at 3:04 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>> Right, getCurrentIndex would return a MultiReader that includes
>> SegmentReader for each segment in the index, plus a "RAMReader" that
>> searches the RAM buffer.  That RAMReader is a tiny shell class that
>> would basically just record the max docID it's allowed to go up to
>> (the docID as of when it was opened), and stop enumerating docIDs
>> (eg in the TermDocs) when it hits a docID beyond that limit.
>
> What about something like term freq?  Would it need to count the
> number of docs after the local maxDoc or is there a better way?

Good question...

I think we'd have to take a full copy of the term -> termFreq on  
reopen?  I don't see how else to do it (I don't understand your  
suggestion above).  So, this will clearly add to the cost of reopen.
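
To make that concrete, here is a rough sketch in plain Java rather than
real Lucene code.  RamBuffer and RamSnapshot are hypothetical names, the
postings are modelled as simple in-memory lists, and I am reading "term
freq" as the number of docs containing a term.  The snapshot records
maxDoc and copies the per-term counts when it is opened, then stops
enumerating docIDs at that limit:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the writer's RAM buffer; not a Lucene class.
class RamBuffer {
    // term -> docIDs containing it, in increasing docID order
    private final Map<String, List<Integer>> postings = new HashMap<>();
    private int nextDocID = 0;

    synchronized int addDocument(String... terms) {
        int docID = nextDocID++;
        for (String term : terms) {
            List<Integer> docs = postings.computeIfAbsent(term, t -> new ArrayList<>());
            // Record each docID once per term, even if the term repeats in the doc.
            if (docs.isEmpty() || docs.get(docs.size() - 1) != docID) {
                docs.add(docID);
            }
        }
        return docID;
    }

    // "Reopen": fix the docID limit and take a full copy of term -> count.
    synchronized RamSnapshot openSnapshot() {
        Map<String, Integer> docFreqs = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : postings.entrySet()) {
            docFreqs.put(e.getKey(), e.getValue().size());
        }
        return new RamSnapshot(this, nextDocID, docFreqs);
    }

    // Enumerate docIDs for a term, stopping at the first one >= maxDoc.
    synchronized List<Integer> docsBelow(String term, int maxDoc) {
        List<Integer> out = new ArrayList<>();
        for (int docID : postings.getOrDefault(term, Collections.<Integer>emptyList())) {
            if (docID >= maxDoc) {
                break;
            }
            out.add(docID);
        }
        return out;
    }
}

// The "tiny shell class": a docID limit plus the counts copied at open time.
class RamSnapshot {
    private final RamBuffer buffer;
    private final int maxDoc;
    private final Map<String, Integer> docFreqs;

    RamSnapshot(RamBuffer buffer, int maxDoc, Map<String, Integer> docFreqs) {
        this.buffer = buffer;
        this.maxDoc = maxDoc;
        this.docFreqs = docFreqs;
    }

    int maxDoc() {
        return maxDoc;
    }

    int docFreq(String term) {
        return docFreqs.getOrDefault(term, 0);
    }

    List<Integer> docs(String term) {
        return buffer.docsBelow(term, maxDoc);
    }
}
```

The copy in openSnapshot is linear in the number of unique terms in the
buffer, which is the extra reopen cost mentioned above.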

>> For reading stored fields and term vectors, which are now flushed
>> immediately to disk, we need to somehow get an IndexInput from the
>> IndexOutputs that IndexWriter holds open on these files.  Or, maybe,
>> just open new IndexInputs?
>
> Hmmm, seems like a case of our nice and simple Directory model not
> having quite enough features in this case.

I think we can simply open IndexInputs on these files.  I believe Java
does the right thing on Windows: a file that is already open for
writing can still be opened for reading through a second file handle.
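
A small standalone check along these lines could confirm that on a given
platform.  It uses plain java.io rather than Lucene's Directory API, and
the class and method names are made up for illustration:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene.
class ReadWhileWriting {
    // Writes text to a temp file and reads it back through a second
    // handle while the writing handle is still open.
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            File file = File.createTempFile("read-while-writing", ".bin");
            file.deleteOnExit();
            try (RandomAccessFile writer = new RandomAccessFile(file, "rw")) {
                writer.write(bytes);
                // The writer has not been closed at this point.
                try (RandomAccessFile reader = new RandomAccessFile(file, "r")) {
                    byte[] buf = new byte[bytes.length];
                    reader.readFully(buf);
                    return new String(buf, StandardCharsets.UTF_8);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

RandomAccessFile does no buffering of its own, so the reader sees the
bytes right away.  Anything still sitting in an application-level write
buffer would not be visible to the second handle until it is flushed.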

>>> Another thing that will help is if users could get their hands on
>>> the sub-readers of a multi-segment reader.  Right now that is hidden
>>> in MultiSegmentReader and makes updating anything incrementally
>>> difficult.
>>
>> Besides what's handled by MultiSegmentReader.reopen already, what
>> else do you need to incrementally update?
>
> Anything that you want to incrementally update and uses an
> IndexReader as a key.
> Mostly caches I would think... Solr has user-level (application
> specific) caches, faceting caches, etc.

Ahh, OK.  Should we just open up access to the sub-readers and mark that
API as advanced?
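
For illustration, a cache keyed per sub-reader might look roughly like
the sketch below.  PerSegmentCache is a hypothetical class, and the key
type R stands in for whatever sub-reader object would be exposed; nothing
here is an existing Lucene or Solr API.  Segments that survive a reopen
keep their entries, so only new segments pay the load cost:

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical cache keyed by sub-reader; not a Lucene or Solr class.
class PerSegmentCache<R, V> {
    // Weak keys let an entry be collected once its reader is unreachable.
    // The cached value must not hold a strong reference back to the key.
    private final Map<R, V> cache = new WeakHashMap<>();
    private final Function<R, V> loader;
    private int loads = 0;

    PerSegmentCache(Function<R, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value for this sub-reader, computing it on first use.
    synchronized V get(R segmentReader) {
        V value = cache.get(segmentReader);
        if (value == null) {
            value = loader.apply(segmentReader);
            cache.put(segmentReader, value);
            loads++;
        }
        return value;
    }

    // Number of times the loader actually ran.
    synchronized int loadCount() {
        return loads;
    }
}
```

With a cache keyed on the top-level reader instead, every reopen would
produce a new key and force a full rebuild.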

Mike
