lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marvin Humphrey <mar...@rectangular.com>
Subject Re: NO_NORMS and TOKENIZED?
Date Wed, 21 Feb 2007 13:52:18 GMT

On Feb 20, 2007, at 11:35 PM, Chris Hostetter wrote:
> the biggest differnece is that the field infos aren't globals, so as
> segments merge and old segments get deleted old data (and field info)
> vanishes into hte ether ... i take advantage of that a lot when  
> planning
> upgrades ... many types of field changes can be made in Solr by  
> deploying
> the schema.xml reindexing slowly over as long as it takes (which  
> the index
> is live and serving queries) then deploying the application changes  
> that
> take advantage of the new field options.

Well, there's no fundamental technical limitation that makes it so  
that global field defs can't be purged -- especially after  
optimization.  I just don't have a system in place that makes it  
possible.  There'd probably be a few bugs if you tried shrinking a  
Schema now, but a few tests should find them and they're killable.

> : The multi-dimensional data problem is the one I'm most interested in
>
> i believe yonik already elaborated on that and gave you some good  
> ideas.
>
> the other situation i brought up in that thread from way back was
> something Solr doesn't currently have a good solution for: one
> field value for display, but other values (in the same field) for
> searching ... likewise indexing one field value for sorting, but  
> storing
> several field values.

In KS, the DocWriter class writes stored fields.  Since the field  
defs have global semantics, you don't need to know anything about the  
contents of the serialized document -- no conflict resolutions.  It's  
just a blob with a certain number of bytes.

My plan is to enable subclassing of DocWriter/DocReader.  That should  
make it possible to scratch a number of itches:

   * multiple values per field
   * different field values for display and searching
   * lazy loading
   * arbitrary data
   * arbitrary compression algo choice
   * complete document recovery
   * ...

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message