lucene-dev mailing list archives

From Michael McCandless <>
Subject Re: Realtime Search for Social Networks Collaboration
Date Tue, 09 Sep 2008 16:45:31 GMT

Yonik Seeley wrote:

> On Tue, Sep 9, 2008 at 11:42 AM, Ning Li <> wrote:
>> On Tue, Sep 9, 2008 at 10:02 AM, Yonik Seeley <> wrote:
>>> Yeah, I think the underlying RandomAccessFile might do the right
>>> thing, but IndexInput isn't required to see any changes on the fly
>>> (and current implementations don't) so at a minimum it would be a
>>> change of IndexInput semantics.  Maybe there would need to be a
>>> refresh() function added, or we would need to require a specific
>>> Directory impl?
>>> OR, if all writes are append-only, perhaps we don't ever need to
>>> invalidate the read buffer and would just need to remove the current
>>> logic that caches the file length and then let the underlying
>>> RandomAccessFile do the EOF checking.
>> We cannot assume it's always RandomAccessFile, can we?
> No, it would essentially be a change in the semantics that all
> implementations would need to support.

Right, the change being that you are allowed to open an IndexInput on a
file while an IndexOutput has that same file open and is still appending
to it.
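
To make the proposed semantics concrete, here is a minimal sketch using plain java.io rather than Lucene's actual IndexInput/IndexOutput. The class names (AppendVisibilitySketch, TailingInput) are hypothetical. It assumes a local file system and an unbuffered writer; a real IndexOutput buffers its writes, so bytes only become visible this way once they have been flushed to the file. The reader never caches the file length and lets RandomAccessFile report EOF.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class AppendVisibilitySketch {

    /** Hypothetical reader that does not cache the file length. */
    static final class TailingInput implements AutoCloseable {
        private final RandomAccessFile file;
        private long pos = 0;

        TailingInput(File f) throws IOException {
            this.file = new RandomAccessFile(f, "r");
        }

        /** Reads up to buf.length bytes; returns 0 when at the current EOF. */
        int readAvailable(byte[] buf) throws IOException {
            file.seek(pos);
            int total = 0;
            while (total < buf.length) {
                // The underlying file does the EOF check on every call.
                int n = file.read(buf, total, buf.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            pos += total;
            return total;
        }

        @Override
        public void close() throws IOException {
            file.close();
        }
    }

    /** Write, read, hit EOF, append, read again; returns the three read counts. */
    static int[] demo() throws IOException {
        File f = File.createTempFile("append-sketch", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile out = new RandomAccessFile(f, "rw");
             TailingInput in = new TailingInput(f)) {
            byte[] buf = new byte[16];
            out.write(new byte[] {1, 2, 3, 4});   // writer still has the file open
            int first = in.readAvailable(buf);    // reader sees the first bytes
            int atEof = in.readAvailable(buf);    // nothing new yet
            out.write(new byte[] {5, 6, 7});      // append-only write
            int second = in.readAvailable(buf);   // reader picks up the appended bytes
            return new int[] {first, atEof, second};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

Because writes are append-only, bytes the reader has already consumed never change, so nothing it has buffered needs to be invalidated.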

>> So we may have to flush after writing each document.
> Flush when creating a new index view (which could possibly be after
> every document is added, but doesn't have to be).

Assuming we can make the above change to the semantics required of
IndexInput, then we don't need to flush on opening a new RAM reader, right?

>> Even so,
>> this may not be sufficient for some FS such as HDFS... Is it
>> reasonable in this case to keep in memory everything including
>> stored fields and term vectors?
> We could maybe do something like a proxy IndexInput/IndexOutput that
> would allow updating the read buffer from the writer buffer.
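
One possible reading of the proxy idea, sketched purely in memory: the reader pulls bytes straight from the writer's buffer, so no flush to the file system is needed before they become visible. SharedOutput and ProxyInput are hypothetical names and are not Lucene classes; a real implementation would also have to combine this with the bytes already flushed to the Directory.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ProxyBufferSketch {

    /** Hypothetical writer side: append-only bytes held in memory. */
    static final class SharedOutput {
        private byte[] data = new byte[16];
        private int length = 0;

        synchronized void writeBytes(byte[] src) {
            if (length + src.length > data.length) {
                // Grow the buffer; existing bytes keep their offsets.
                data = Arrays.copyOf(data, Math.max(data.length * 2, length + src.length));
            }
            System.arraycopy(src, 0, data, length, src.length);
            length += src.length;
        }

        /** Copies bytes starting at pos into dst; returns how many were copied. */
        synchronized int copyFrom(int pos, byte[] dst) {
            int n = Math.min(dst.length, length - pos);
            if (n <= 0) {
                return 0;
            }
            System.arraycopy(data, pos, dst, 0, n);
            return n;
        }
    }

    /** Hypothetical reader side: refills its buffer from the writer's buffer. */
    static final class ProxyInput {
        private final SharedOutput source;
        private int pos = 0;

        ProxyInput(SharedOutput source) {
            this.source = source;
        }

        int read(byte[] dst) {
            int n = source.copyFrom(pos, dst);
            pos += n;
            return n;
        }
    }

    /** Interleaves writes and reads; returns what the reader saw each time. */
    static String demo() {
        SharedOutput out = new SharedOutput();
        ProxyInput in = new ProxyInput(out);
        byte[] buf = new byte[8];

        out.writeBytes("doc1".getBytes(StandardCharsets.US_ASCII));
        String first = new String(buf, 0, in.read(buf), StandardCharsets.US_ASCII);

        out.writeBytes("doc2".getBytes(StandardCharsets.US_ASCII));
        String second = new String(buf, 0, in.read(buf), StandardCharsets.US_ASCII);

        return first + "|" + second;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The synchronized methods are the simplest way to keep the sketch safe with one writer thread and one reader thread; whether that cost is acceptable in the indexing path is a separate question.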

Does HDFS disallow a reader from reading a file that's still open for  

