lucene-java-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: NO_NORMS and TOKENIZED?
Date Sat, 17 Feb 2007 01:21:26 GMT

On Feb 15, 2007, at 1:20 PM, Yonik Seeley wrote:

> I originally added it without an Index param at all.
> I can't say I'm a fan of the way Field currently does things, and I
> didn't want everyone to pay the price for yet more options.

Solved... with fixed field definitions.

http://www.rectangular.com/kinosearch/docs/devel/KinoSearch/Schema.html
http://www.rectangular.com/kinosearch/docs/devel/KinoSearch/Schema/FieldSpec.html
http://www.rectangular.com/kinosearch/docs/devel/KinoSearch/InvIndex.html

> I'm probably in the minority, hence I never said
> anything about it before.

Heh.  Your compadre Hoss is one of the most ardent opponents of fixed  
field defs.  :)

I've been planning to ask him at some point how he might deal with
multidimensional data if he were forced to use KinoSearch's
Schema/FieldSpec system.  What say you, Hoss, can you think of something?

For KS, a solution to that problem would be nice to have, but not  
essential.  KS has never allowed dynamic field defs, so there's no  
constituency wedded to them.  And fixed field defs solve so many more  
problems than they cause.

Imagine a world with no search-time/index-time analyzer mismatches...
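
A rough sketch of why that follows, in Java since that is this list's
language.  Everything here (the Schema class, addField, analyze) is a
hypothetical stand-in, not KinoSearch or Lucene API: if each field is
defined exactly once and the definition owns the analyzer, the indexer
and the searcher have no way to disagree.

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical schema for illustration only; not KinoSearch or Lucene API.
final class Schema {
    // One analyzer per field name, fixed when the field is defined.
    private final Map<String, Function<String, List<String>>> analyzers = new HashMap<>();

    // Define a field once; attempting to redefine it is an error.
    void addField(String name, Function<String, List<String>> analyzer) {
        if (analyzers.containsKey(name)) {
            throw new IllegalStateException("field already defined: " + name);
        }
        analyzers.put(name, analyzer);
    }

    // Index-time and search-time code both call this, so they share one analyzer.
    List<String> analyze(String field, String text) {
        Function<String, List<String>> analyzer = analyzers.get(field);
        if (analyzer == null) {
            throw new IllegalArgumentException("unknown field: " + field);
        }
        return analyzer.apply(text);
    }
}
```

An indexer and a query parser holding the same Schema would both call
schema.analyze("title", ...) and get identical tokenization.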

> How would I have handled it?
>  With a single Field class, I would probably have used old-fashioned
> c-style flags (a bit field).  Nice and extensible (you can add new
> flags without adding/changing any APIs, no performance impact to
> adding new options, you can pass around all these flags as a unit,
> check multiple flags with a single instruction, etc.)
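
For anyone who hasn't used the idiom, the bit-field approach might look
roughly like this.  The constant names and the hasAll helper are made up
for illustration; they are not existing Lucene API.

```java
// Hypothetical flag constants for illustration only; not Lucene API.
final class FieldFlags {
    static final int STORED     = 1 << 0;
    static final int INDEXED    = 1 << 1;
    static final int TOKENIZED  = 1 << 2;
    static final int OMIT_NORMS = 1 << 3;

    private FieldFlags() {}

    // True only if every bit in mask is also set in flags;
    // several options are checked with a single AND and compare.
    static boolean hasAll(int flags, int mask) {
        return (flags & mask) == mask;
    }
}
```

Under this scheme, "tokenized, no norms" is just
FieldFlags.INDEXED | FieldFlags.TOKENIZED | FieldFlags.OMIT_NORMS, and a
new option is a new constant rather than a new type-safe enum value.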

I don't think you can convey enough information using only flags.

Looking ahead, it's easy to specify a "flexible indexing" posting  
format by adding that property to a FieldSpec via a method or an  
inner class, but hard to see how you'd put that amount of info in a  
bit -- let alone resolve conflicts.
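
To illustrate the difference, here is a minimal sketch of a field spec
that carries a posting format as an object.  All of these types
(FieldSpec, PostingFormat, IdFieldSpec) are hypothetical Java analogues
written for this example, not the actual KinoSearch classes.

```java
// Hypothetical types for illustration only; not KinoSearch or Lucene API.
interface PostingFormat {
    String name();
}

// Default field definition: indexed, analyzed, scoring postings.
class FieldSpec {
    boolean indexed()  { return true; }
    boolean analyzed() { return true; }

    // A structured property that a single bit cannot express.
    PostingFormat postingFormat() {
        return () -> "score";
    }
}

// A subclass overrides only what differs from the defaults.
class IdFieldSpec extends FieldSpec {
    @Override boolean analyzed() { return false; }

    @Override PostingFormat postingFormat() {
        return () -> "match-only";
    }
}
```

A flag can say whether a field has some property; a method can return an
arbitrary object describing how the property behaves.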

People want to attach greater and greater semantic meaning to field  
names.  The present system doesn't scale well, as you've pointed out;  
there will be ever-increasing tension until it is modified.

> I guess my short answer is that I don't have an
> opinion on adding another type-safe constant TOKENIZED_NO_NORMS
> because I don't like the whole scheme.

KS 0.20 doesn't even have Document or Field classes.  :)  They've  
been eliminated, and native Perl hashes are now used to transport  
document data.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/




