lucene-dev mailing list archives

From Doug Cutting
Subject Re: Global field semantics
Date Mon, 10 Jul 2006 07:12:17 GMT
Chuck Williams wrote:
> Lucene today allows many field properties to vary at the Field level. 
> E.g., the same field name might be tokenized in one Field on a Document
> while it is untokenized in another Field on the same or different
> Document.

The rationale for this design was to keep the API simple.  I think of it 
like variable declarations: some languages require them and some don't. 
I opted to make Lucene fields like dynamically-typed variables.  In 
part, Lucene's popularity is due to the simplicity of its API.

However, in my uses of Lucene, most documents have the same fields used 
in the same way, so I don't think I've ever actually taken much 
advantage of this functionality.  It is nice to be able to add a field 
to an index by changing the indexing code in a single place, where the 
field's value is created, and not having to also change the index 
initialization code.  We should try to keep such redundancies out of 
user code.

Thus I would encourage any change in this direction to continue to 
permit fields to be defined lazily, the first time they are added, 
rather than requiring all fields to be declared up front.  Are there 
substantial optimizations that are only possible if all fields are known 
when the index is initialized?
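
To make the lazy-definition idea concrete, here is a minimal sketch of how field semantics could be global per field name yet still defined on first use. It is an illustration under assumptions, not Lucene's API: the names LazyFieldRegistry, FieldSpec, and addField are hypothetical, and the only properties modeled are "stored" and "tokenized".

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not part of Lucene): a registry that fixes a field's
 * properties the first time the field is added, so nothing has to be
 * declared when the index is initialized.
 */
public class LazyFieldRegistry {

    /** Assumed per-field properties; real Lucene fields carry more options. */
    static final class FieldSpec {
        final boolean stored;
        final boolean tokenized;

        FieldSpec(boolean stored, boolean tokenized) {
            this.stored = stored;
            this.tokenized = tokenized;
        }

        boolean sameAs(FieldSpec other) {
            return stored == other.stored && tokenized == other.tokenized;
        }
    }

    private final Map<String, FieldSpec> specs = new HashMap<>();

    /**
     * Defines the field on first use and returns true; returns false if the
     * field was already defined with the same properties. A later add with
     * different properties is rejected.
     */
    public boolean addField(String name, boolean stored, boolean tokenized) {
        FieldSpec incoming = new FieldSpec(stored, tokenized);
        FieldSpec existing = specs.putIfAbsent(name, incoming);
        if (existing == null) {
            return true; // first sighting: this add is the definition
        }
        if (!existing.sameAs(incoming)) {
            throw new IllegalArgumentException(
                "Field '" + name + "' was already defined with different properties");
        }
        return false;
    }

    /** Number of distinct field names defined so far. */
    public int size() {
        return specs.size();
    }

    public static void main(String[] args) {
        LazyFieldRegistry registry = new LazyFieldRegistry();
        registry.addField("title", true, true);  // defines "title"
        registry.addField("title", true, true);  // consistent reuse, no new definition
        registry.addField("id", true, false);    // a new field added later, no upfront schema
        System.out.println(registry.size());
    }
}
```

In this sketch the indexing code remains the only place a new field has to be mentioned, which is the property the paragraph above asks to preserve. Whether a conflicting later add should be rejected, as here, or handled some other way is a design choice the sketch does not settle.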

