lucene-dev mailing list archives

From Yonik Seeley <ysee...@gmail.com>
Subject Re: svn commit: r332747 - in /lucene/java/trunk: ./ src/java/org/apache/lucene/search/regex/ src/test/org/apache/lucene/search/regex/
Date Thu, 17 Nov 2005 22:26:14 GMT
Heh. Mid air collision...

On 11/17/05, Doug Cutting <cutting@apache.org> wrote:
> Yonik Seeley wrote:
> > I'm worried about the impact of things like this:
> >  smallfloat(10) + smallfloat(1) + smallfloat(1) + smallfloat(1) -> 10
> >
> > And it makes things very order dependent:
> >  smallfloat(1) + smallfloat(1) + smallfloat(1) + smallfloat(10) -> 12
>
> 10 and 12 are pretty close scores, so while this is clearly not a good
> thing, relevant and irrelevant documents are hopefully separated by more
> than this.

Yeah, I was just hoping to be able to transparently change scorers,
but if order of addition matters, one won't be able to match those
scores.  Maybe it's not so important.
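
To make the order dependence concrete, here is a rough Python sketch (not Lucene code). It assumes a byte-sized float made by truncating an IEEE float, using the current norm parameters (3-bit mantissa, zero-exponent point of 15), and it assumes a scorer that truncates back to that byte after every addition. All helper names are made up for illustration.

```python
import struct

def float_to_bits(f):
    # Raw IEEE-754 single-precision bits as a signed 32-bit int.
    return struct.unpack(">i", struct.pack(">f", f))[0]

def bits_to_float(bits):
    return struct.unpack(">f", struct.pack(">i", bits))[0]

def float_to_byte(f, mantissa_bits=3, zero_exp=15):
    # Keep the top bits of the float and drop the rest (truncation, no rounding).
    fzero = (63 - zero_exp) << mantissa_bits
    bits = float_to_bits(f)
    small = bits >> (24 - mantissa_bits)
    if small <= fzero:
        return 0 if bits <= 0 else 1   # zero/negative -> 0, underflow -> smallest value
    if small >= fzero + 0x100:
        return 255                     # overflow saturates at the largest value
    return small - fzero

def byte_to_float(b, mantissa_bits=3, zero_exp=15):
    if b == 0:
        return 0.0
    bits = (b & 0xFF) << (24 - mantissa_bits)
    bits += (63 - zero_exp) << 24
    return bits_to_float(bits)

def smallfloat(f):
    # Round-trip through the byte encoding.
    return byte_to_float(float_to_byte(f))

def smallfloat_sum(values):
    # Hypothetical scorer: every partial sum is truncated to a small float.
    total = 0.0
    for v in values:
        total = smallfloat(total + smallfloat(v))
    return total

print(smallfloat_sum([10.0, 1.0, 1.0, 1.0]))  # 10.0
print(smallfloat_sum([1.0, 1.0, 1.0, 10.0]))  # 12.0
```

Near 10 the representable values in this sketch are 8, 10, 12, 14, so each +1 on top of 10 is truncated away, while 1+1+1 accumulates exactly before the 10 arrives.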

> > Also, epsilon related to the mantissa, not the exponent?
> > That would make it 1/8, not 1/32.
>
> I'm not sure what you're saying.  The current epsilon, with 3-bit
> mantissa, is 1/8, right?  With a five bit mantissa it would go to 1/32, no?

Ahh, my mistake.  I transposed 5 and 3 from your last email (I thought
you were referring to the current norm encoding).

> Right.  Arguably we don't need numbers smaller than 1/100.  A 4-bit
> mantissa with a zero exponent point of 5 gives a minimum value of .0005
> and a max of 2M, plenty of range.  A 5-bit mantissa with zero-exponent
> point of 2 gives us a minimum of .03 and a max of around 2k, nearly the
> desired range, but with greater precision.  In your case above, 10+1+1
> would give 12, moreover 10+.5+.5 would give 11.  I think this is
> probably the best choice.  What do you think?

Hmmm, is .03->2000 really enough range?
Seems like the choice is between that and .0005->2000000 with one less
mantissa bit.
I'm not really sure, but I guess it doesn't have to be decided now...
it's easily changeable.
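
For comparing the two candidates, a rough Python sketch of the ranges (again not Lucene code; it assumes the byte is just the top bits of an IEEE float, truncated, and the function names are hypothetical):

```python
import struct

def _bits(f):
    # Raw IEEE-754 single-precision bits as a signed 32-bit int.
    return struct.unpack(">i", struct.pack(">f", f))[0]

def _float(bits):
    return struct.unpack(">f", struct.pack(">i", bits))[0]

def encode(f, mantissa_bits, zero_exp):
    # Truncate a float to one byte with the given mantissa width and zero-exponent point.
    fzero = (63 - zero_exp) << mantissa_bits
    bits = _bits(f)
    small = bits >> (24 - mantissa_bits)
    if small <= fzero:
        return 0 if bits <= 0 else 1   # zero/negative -> 0, underflow -> smallest value
    if small >= fzero + 0x100:
        return 255                     # overflow saturates
    return small - fzero

def decode(b, mantissa_bits, zero_exp):
    if b == 0:
        return 0.0
    return _float(((b & 0xFF) << (24 - mantissa_bits)) + ((63 - zero_exp) << 24))

def value_range(mantissa_bits, zero_exp):
    # Smallest and largest non-zero values the byte can hold.
    return decode(1, mantissa_bits, zero_exp), decode(255, mantissa_bits, zero_exp)

def truncating_sum(values, mantissa_bits, zero_exp):
    # Assumes every partial sum is truncated back to the byte encoding.
    total = 0.0
    for v in values:
        total = decode(encode(total + v, mantissa_bits, zero_exp), mantissa_bits, zero_exp)
    return total

print(value_range(4, 5))   # (0.00054931640625, 1966080.0)
print(value_range(5, 2))   # (0.033203125, 1984.0)
print(truncating_sum([10.0, 1.0, 1.0], 5, 2))   # 12.0
print(truncating_sum([10.0, 0.5, 0.5], 5, 2))   # 11.0
```

Under those assumptions the numbers line up with the figures quoted above: roughly .0005 to 2M for the 4-bit option, roughly .03 to 2k for the 5-bit option, and a step of 0.5 near 10 for the latter.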



-Yonik


