lucene-dev mailing list archives

From Grant Ingersoll <gsing...@syr.edu>
Subject Lazy Field Loading
Date Wed, 29 Mar 2006 13:31:47 GMT
I have a base implementation of lazy field loading that I am starting to 
test and wanted to run my approach by everyone to hear their thoughts.

I have, as per Doug's suggestion from a while ago, created an interface 
named Fieldable that is implemented by Field and a new, private class, 
owned by FieldsReader.  I have introduced an "enumerated" type to the 
Field class named LazyLoad (which can be YES or NO, in the same spirit 
as Field.TermVector).  Any place that used to take Field now takes 
Fieldable.  This should be completely transparent and 
backward-compatible.  The existing constructors of Field all assume lazy 
loading to be off.
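
To make the shape of this concrete, here is a minimal sketch of the 
interface and the typesafe "enumerated" type.  The names Fieldable, Field 
and LazyLoad come from the description above; the method names (name, 
stringValue, isLazy) and everything else are simplified assumptions for 
illustration, not the actual patch.

```java
// Simplified stand-ins for illustration only; not Lucene's actual classes.
interface Fieldable {
    String name();
    String stringValue();
    boolean isLazy();
}

final class Field implements Fieldable {
    // Typesafe-enum pattern, in the same spirit as Field.TermVector.
    static final class LazyLoad {
        static final LazyLoad YES = new LazyLoad("YES");
        static final LazyLoad NO = new LazyLoad("NO");
        private final String label;
        private LazyLoad(String label) { this.label = label; }
        @Override public String toString() { return label; }
    }

    private final String name;
    private final String value;
    private final LazyLoad lazy;

    // Existing-style constructor: lazy loading is assumed to be off.
    Field(String name, String value) {
        this(name, value, LazyLoad.NO);
    }

    // New-style constructor: the caller chooses LazyLoad.YES or LazyLoad.NO.
    Field(String name, String value, LazyLoad lazy) {
        this.name = name;
        this.value = value;
        this.lazy = lazy;
    }

    @Override public String name() { return name; }
    @Override public String stringValue() { return value; }
    @Override public boolean isLazy() { return lazy == LazyLoad.YES; }
}
```

Code that previously accepted Field would accept Fieldable instead, so it 
would not need to know which implementation it was handed.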

On creation of a Field, a user can pass in LazyLoad.YES or NO to a 
constructor that takes either a String value or a byte array (it does 
not apply to the Reader constructors since they do not store their 
content).  Indexing and writing of fields take place as normal.  The only 
difference is that an extra bit is added to the field writing that 
marks the field as being lazy.
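
As a rough illustration of the extra bit, the sketch below packs a lazy 
flag into a per-field flag byte ahead of the stored value.  The record 
layout, the class name LazyFlagSketch and the bit values are all 
hypothetical; the real stored-fields format may well differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical record layout: one flag byte followed by the UTF-8 value.
final class LazyFlagSketch {
    static final byte FLAG_TOKENIZED = 0x1; // assumed bit, for illustration
    static final byte FLAG_LAZY = 0x8;      // hypothetical bit for lazy fields

    // Write side: set the lazy bit alongside any other per-field flags.
    static byte[] encode(String value, boolean tokenized, boolean lazy) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        byte flags = 0;
        if (tokenized) flags |= FLAG_TOKENIZED;
        if (lazy) flags |= FLAG_LAZY;
        byte[] record = new byte[1 + payload.length];
        record[0] = flags;
        System.arraycopy(payload, 0, record, 1, payload.length);
        return record;
    }

    // Read side: the flag byte alone says whether loading can be deferred.
    static boolean isLazy(byte[] record) {
        return (record[0] & FLAG_LAZY) != 0;
    }

    static boolean isTokenized(byte[] record) {
        return (record[0] & FLAG_TOKENIZED) != 0;
    }
}
```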

When a field is read back and it is marked lazy, FieldsReader does not 
read the value and construct a Field.  Instead, it constructs a LazyField 
instance, which takes the pointer into the fieldsStream and the amount 
of data to read.  Since LazyField is a private class of FieldsReader, 
it retains access to the fieldsStream.  Thus, when an application goes 
to access the value of the field, we check whether it has been loaded.  
If it has not, we load it using the fieldsStream, the pointer and the 
length to read.
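
The read path described above could be sketched roughly as follows.  
This is only an illustration under stated assumptions: a byte array 
stands in for the fieldsStream, and FieldsReaderSketch, lazyField and 
bytesRead are hypothetical names, not the patch itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for FieldsReader; a byte[] plays the fieldsStream.
final class FieldsReaderSketch {
    private final byte[] fieldsStream;
    private int bytesRead = 0; // bookkeeping so deferred reads are observable

    FieldsReaderSketch(byte[] data) {
        this.fieldsStream = data;
    }

    int bytesRead() {
        return bytesRead;
    }

    // Record only where the value lives; do not read it yet.
    LazyField lazyField(String name, int pointer, int length) {
        return new LazyField(name, pointer, length);
    }

    // Inner class, so it keeps access to the enclosing reader's stream.
    final class LazyField {
        private final String name;
        private final int pointer;
        private final int length;
        private String value; // null until first access

        private LazyField(String name, int pointer, int length) {
            this.name = name;
            this.pointer = pointer;
            this.length = length;
        }

        String name() {
            return name;
        }

        String stringValue() {
            if (value == null) {
                // First access: read `length` bytes starting at `pointer`.
                value = new String(fieldsStream, pointer, length,
                        StandardCharsets.UTF_8);
                bytesRead += length;
            }
            return value;
        }
    }
}
```

In this sketch, constructing the LazyField reads nothing; the cost is paid 
once, on the first call to stringValue().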

Does anyone see any issues with this?  I think it will only really pay 
off on large stored fields, but have not quantified it yet.  My main 
concern is the semantics of the fieldsStream and whether that would be 
closed behind the back of the LazyField implementation.  My 
understanding is that as long as the IndexReader is open, this stream 
should also be open.  Is that correct?   What am I forgetting about?
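
To illustrate the concern about the stream being closed, here is a small 
sketch of what could happen if the reader were closed before a lazy value 
had been loaded.  ClosableReaderSketch and its explicit closed check are 
hypothetical; this shows the hazard, not how Lucene actually behaves.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical reader whose stream goes away on close().
final class ClosableReaderSketch {
    private byte[] fieldsStream; // null once the reader is closed

    ClosableReaderSketch(byte[] data) {
        this.fieldsStream = data;
    }

    void close() {
        fieldsStream = null;
    }

    LazyValue lazy(int pointer, int length) {
        return new LazyValue(pointer, length);
    }

    final class LazyValue {
        private final int pointer;
        private final int length;
        private String value;

        private LazyValue(int pointer, int length) {
            this.pointer = pointer;
            this.length = length;
        }

        String get() {
            if (value == null) {
                // A value never loaded while the reader was open is lost.
                if (fieldsStream == null) {
                    throw new IllegalStateException(
                            "reader closed before lazy field was loaded");
                }
                value = new String(fieldsStream, pointer, length,
                        StandardCharsets.UTF_8);
            }
            return value;
        }
    }
}
```

A value loaded while the reader is open stays usable after close(), while 
one that was never touched fails on first access.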

If testing goes well, I should be able to button this up this week or 
next and submit the patch.

-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 

