lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-user] Lucy questions wrt production, ranking, etc
Date Sun, 11 Sep 2011 20:13:52 GMT
On Sat, Sep 10, 2011 at 06:50:52PM +0200, goran kent wrote:
> Hmm, my sort floats are typically like this:
> 
> min:     0.1254619125
> max:    3117.88289166118
> typical: 0.231372871201865  (15 dec places max)
> 
> sprintf("%0.15f") in Perl produces:
> 
> 0.125461912500000  ok
> 3117.882891661180111 screw up (should be 3117.882891661180000)
> 0.231372871201865 ok
> 
> I imagine the large value 3117 with the added "0111" won't present a
> problem when sorting -- but I have a feeling I'm wrong, right?

;)

Sure, add the value "8.8" to the mix and you're in trouble.  The comparison is
by string, character by character, and "8" comes after "3", so "8.8" will sort
last:

  $ perl -MData::Dumper -le 'my @stuff = ("0.125461912500000", "3117.882891661180111",
      "0.231372871201865", "8.8"); warn Dumper([sort @stuff])'
  $VAR1 = [
            '0.125461912500000',
            '0.231372871201865',
            '3117.882891661180111',
            '8.8'
          ];

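You need to change the format argument from "%0.15f" to something like
"%023.15f".  That sets a minimum total width of 23 characters (7 integer
digits, the decimal point, and 15 decimal places) and fills with leading
zeros, so that every string has the same length and string order matches
numeric order.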

  $ perl -MData::Dumper -le 'my @stuff = map {sprintf "%023.15f", $_} ("0.125461912500000",
      "3117.882891661180111", "0.231372871201865", "8.8"); warn Dumper([sort @stuff])'
  $VAR1 = [
            '0000000.125461912500000',
            '0000000.231372871201865',
            '0000008.800000000000001',
            '0003117.882891661180111'
          ];

Watch out, though -- you have to know the range in advance in order to choose
the format so that you don't overflow.  Effectively what we're doing is
converting a floating point representation to "fixed point", and fixed point
cannot represent the same range as floating point without occupying a lot more
space.
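
One way to guard against that overflow is to check the length of the encoded
string, since sprintf treats the width as a minimum and silently produces a
longer string when the value is too large.  The following is only a rough
sketch: sortable_float() is a hypothetical helper, not something Lucy
provides, and the 7 integer digits / 15 decimal places are assumptions taken
from the sample values in this thread.  It also refuses negative values,
because a leading minus sign would break the string ordering.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lucy): encode a non-negative float as a
# zero-padded, fixed-width string so that string order matches numeric order.
sub sortable_float {
    my ($value, $int_digits, $frac_digits) = @_;

    # Total width: integer digits + decimal point + fractional digits.
    my $width = $int_digits + 1 + $frac_digits;

    die "negative values are not handled by this sketch: $value\n"
        if $value < 0;

    my $encoded = sprintf("%0${width}.${frac_digits}f", $value);

    # The width is only a minimum, so a value with too many integer digits
    # yields a longer string that would no longer sort correctly.
    die "value $value overflows $int_digits integer digits\n"
        if length($encoded) != $width;

    return $encoded;
}

my @values = (0.1254619125, 3117.88289166118, 0.231372871201865, 8.8);
my @sorted = sort map { sortable_float($_, 7, 15) } @values;
print "$_\n" for @sorted;
```

With 7 integer digits this is equivalent to the "%023.15f" format above; a
value of 10,000,000 or more would trigger the overflow error instead of
quietly producing a string that sorts in the wrong place.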

Marvin Humphrey

