lucene-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Lazy Field Loading
Date Wed, 29 Mar 2006 13:43:10 GMT
Lazy-loaded fields will be a nice addition to Lucene. I'm curious
why the flag is set at indexing time rather than being something
that is controlled during retrieval. I'm not sure what that
API would look like, but it seems it's a decision to be addressed
during searching and reading of an index rather than during indexing
itself.
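
One possible shape for a retrieval-time API, purely as a sketch: the
caller hands the reader a selector that decides, per field name, how
each stored field is loaded. Every name below (FieldLoadSelector,
LoadMode, lazyFor) is hypothetical and made up for illustration; none
of it is existing Lucene API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not existing Lucene API: the load policy is chosen
// by the caller at retrieval time instead of being fixed at indexing time.
class RetrievalSelectorSketch {

    // How a stored field should be materialized when a document is read.
    enum LoadMode { LOAD, LAZY_LOAD, NO_LOAD }

    // Consulted once per stored field while reading a document.
    interface FieldLoadSelector {
        LoadMode accept(String fieldName);
    }

    // Example policy: the named fields are lazy, everything else is eager.
    static FieldLoadSelector lazyFor(String... lazyFieldNames) {
        Set<String> lazy = new HashSet<>();
        for (String name : lazyFieldNames) {
            lazy.add(name);
        }
        return fieldName -> lazy.contains(fieldName) ? LoadMode.LAZY_LOAD : LoadMode.LOAD;
    }

    public static void main(String[] args) {
        FieldLoadSelector selector = lazyFor("body");
        System.out.println("title: " + selector.accept("title"));
        System.out.println("body: " + selector.accept("body"));
    }
}
```

With something like this, the same index could serve one caller that
wants the large "body" field lazily and another that wants it eagerly,
without reindexing.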

	Erik


On Mar 29, 2006, at 8:31 AM, Grant Ingersoll wrote:

> I have a base implementation of lazy field loading that I am  
> starting to test and wanted to run my approach by everyone to hear  
> their thoughts.
>
> I have, as per Doug's suggestion from a while ago, created an  
> interface named Fieldable that is implemented by Field and a new,  
> private class, owned by FieldsReader.  I have introduced an  
> "enumerated" type to the Field class named LazyLoad (which can be  
> YES or NO, in the same spirit as Field.TermVector).  Any place that  
> used to take Field now takes Fieldable.  This should be completely  
> transparent and backward-compatible.  The existing constructors of  
> Field all assume lazy to be off.
>
> On creation of a Field, a user can pass in LazyLoad.YES or NO to a  
> constructor that takes either a String value or a byte array (it  
> does not apply to the Reader constructors since they do not store  
> their content).  Indexing and writing of fields take place as  
> normal, the only difference being there is an extra bit added to  
> the field writing that marks the field as being lazy.
>
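
For readers following along, the "extra bit" could be sketched as one
more mask in a per-field flag byte. The mask values and names below
(TOKENIZED, BINARY, LAZY) are assumptions for illustration only; the
actual on-disk layout is whatever the patch defines.

```java
// Hypothetical sketch of a per-field flag byte with an added "lazy" bit.
// The mask values are illustrative assumptions, not Lucene's file format.
class FieldFlagsSketch {
    static final byte TOKENIZED = 0x1;
    static final byte BINARY    = 0x2;
    static final byte LAZY      = 0x8; // the assumed new bit

    // Pack the field's properties into a single byte at write time.
    static byte encode(boolean tokenized, boolean binary, boolean lazy) {
        byte bits = 0;
        if (tokenized) bits |= TOKENIZED;
        if (binary)    bits |= BINARY;
        if (lazy)      bits |= LAZY;
        return bits;
    }

    // Test the lazy bit at read time.
    static boolean isLazy(byte bits) {
        return (bits & LAZY) != 0;
    }

    public static void main(String[] args) {
        byte bits = encode(true, false, true);
        System.out.println("lazy? " + isLazy(bits));
    }
}
```
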
> When a field is read, if it is lazy, then instead of reading in  
> the value for the field and constructing a Field, we construct a  
> LazyField instance, which takes the pointer into the fieldsStream  
> and the amount of data to read.  This instance, since it is a  
> private class of FieldsReader, maintains access to the  
> fieldsStream.  Thus, when an application goes to access the value of  
> the field, we check whether it has been loaded.  If it  
> has not, we load it using the fieldsStream, the pointer, and the  
> length to read.
>
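
A minimal sketch of the mechanism as described, under stated
assumptions: a byte array stands in for the fieldsStream, and
StoredFieldsReader stands in for FieldsReader. The class names, the
"reads" counter, and the choice to throw IllegalStateException when the
reader has been closed before first access are all assumptions of this
sketch, not the patch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of deferred loading via (pointer, length).
class LazyFieldSketch {

    // Stand-in for FieldsReader; the byte array plays the role of fieldsStream.
    static class StoredFieldsReader {
        private final byte[] fieldsStream;
        private boolean closed = false;
        int reads = 0; // counts actual loads, for demonstration only

        StoredFieldsReader(byte[] fieldsStream) {
            this.fieldsStream = fieldsStream;
        }

        // Record where the value lives instead of reading it.
        LazyField lazyField(String name, int pointer, int length) {
            return new LazyField(name, pointer, length);
        }

        void close() {
            closed = true;
        }

        // Inner (non-static) class, so it keeps access to the reader's stream.
        class LazyField {
            private final String name;
            private final int pointer;
            private final int length;
            private String value; // null until first access

            private LazyField(String name, int pointer, int length) {
                this.name = name;
                this.pointer = pointer;
                this.length = length;
            }

            String name() {
                return name;
            }

            // Load on first access, then serve the cached value.
            String stringValue() {
                if (value == null) {
                    if (closed) {
                        throw new IllegalStateException(
                            "reader closed before lazy field '" + name + "' was loaded");
                    }
                    value = new String(fieldsStream, pointer, length, StandardCharsets.UTF_8);
                    reads++;
                }
                return value;
            }
        }
    }

    public static void main(String[] args) {
        byte[] stream = "titlea large stored body".getBytes(StandardCharsets.UTF_8);
        StoredFieldsReader reader = new StoredFieldsReader(stream);
        StoredFieldsReader.LazyField body = reader.lazyField("body", 5, stream.length - 5);
        System.out.println("loads before access: " + reader.reads);
        System.out.println(body.stringValue());
        body.stringValue(); // second access is served from the cached value
        System.out.println("loads after two accesses: " + reader.reads);
    }
}
```

In this sketch a value that was already loaded stays readable after
close(), while a never-accessed lazy field fails loudly; whether that is
the right contract is a design choice the sketch does not settle.
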
> Does anyone see any issues with this?  I think it will only really  
> pay off on large stored fields, but have not quantified it yet.  My  
> main concern is the semantics of the fieldsStream and whether that  
> would be closed behind the back of the LazyField implementation.   
> My understanding is that as long as the IndexReader is open, this  
> stream should also be open.  Is that correct?   What am I  
> forgetting about?
>
> If testing goes well, I should be able to button this up this week  
> or next and submit the patch.
>
> -- 
>
> Grant Ingersoll
> Sr. Software Engineer
> Center for Natural Language Processing
> Syracuse University
> School of Information Studies
> 335 Hinds Hall
> Syracuse, NY 13244
> http://www.cnlp.org
> Voice: 315-443-5484
> Fax: 315-443-6886
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org



