incubator-couchdb-user mailing list archives

From Brian Candler <B.Cand...@pobox.com>
Subject Re: Silent corruption of large numbers
Date Wed, 11 Nov 2009 08:11:55 GMT
On Tue, Nov 10, 2009 at 04:23:07PM -0800, Roger Binns wrote:
> I looked up a few Javascript tutorials and didn't find a single one stating
> that all numbers are stored as float.

Ah, you can't believe everything you read on the Internet :-)

I have "Javascript: The Definitive Guide" which says very early on (p22)
that all Numbers are floating-point, and this differs from languages like C
and Java. I'd certainly recommend this book.
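
As a quick sketch of what that means in practice (my own illustration, not
taken from the book; the field name "id" is made up), integers above 2^53
can no longer all be told apart, and a large value read from JSON is rounded
without any warning:

```javascript
// Integer and fractional literals share a single type.
const kinds = [typeof 1, typeof 1.5];      // both "number"

// 2^53 is the point where consecutive integers stop being representable.
const limit = Math.pow(2, 53);             // 9007199254740992
const collapsed = (limit + 1 === limit);   // limit + 1 rounds back to limit

// A large integer arriving in JSON is rounded in the same way.
// The "id" field is hypothetical.
const parsed = JSON.parse('{"id": 9007199254740993}').id;

console.log(kinds, collapsed, parsed === limit);
```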

"JavaScript: The Good Parts" looks good too, but is much slimmer, as it
covers only the language and not the in-browser APIs; I don't have a copy
of that one.

> You can even see this in Javascript documentation itself.  It will say that
> parseInt returns an integer and parseFloat returns a floating point again
> implying two different types.

ECMA-262 is very clear that there is only one Number type (section 4.3.20).
The introduction to parseInt does talk about making an integer 'value', but
it's clear that it can be rounded if necessary:

"Compute the mathematical integer value that is represented by Z in radix-R
notation, using the letters A-Z and a-z for digits with values 10 through
35. (However, if R is 10 and Z contains more than 20 significant digits,
every significant digit after the 20th may be replaced by a 0 digit, at the
option of the implementation; and if R is not 2, 4, 8, 10, 16, or 32, then
Result(16) may be an implementation-dependent approximation to the
mathematical integer value that is represented by Z in radix-R notation.)"
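
A small sketch of the consequence, assuming a modern engine that provides
Number.isSafeInteger (older interpreters would need a manual range check
instead). Both functions hand back the same type, and parseInt rounds a long
digit string to the nearest representable value:

```javascript
// parseInt and parseFloat both return the one Number type.
const a = parseInt("42", 10);
const b = parseFloat("42.5");
const sameType = (typeof a === typeof b);   // both are "number"

// A digit string beyond 2^53 is rounded to the nearest representable value.
const big = parseInt("9007199254740993", 10);
const rounded = (big === 9007199254740992);

// Number.isSafeInteger flags results that may have lost precision.
const safe = Number.isSafeInteger(big);

console.log(sameType, rounded, safe);
```

If the exact digits matter, the usual workaround is to keep the value as a
string rather than parse it.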

> I am also willing to bet that this is not widely known.

I'm not sure that's true in general (anyone working with large numbers
inside a browser would know), but if your clients don't, then you can
educate them.

> > Even if you could raise an exception, there currently isn't a good mechanism
> > to handle it. If a map function bombs out, all you get is an error in the
> > log and that document disappears from the view entirely. There's nothing
> > reported back to the view user in any form.
> 
> I assume the log isn't available via REST either

Actually it is, but I would prefer not to rely on "sysadmin" features being
available to normal users. I would block such things in a proxy. The log is
global, rather than per-database.

> Perhaps view results
> need to include an "errors" key with an integer of how many occurred in
> generating the view.

Yes, I think something like this would be a good idea: show the number of
documents with view generation errors, and ideally which documents they were
and the errors generated, perhaps by having a reserved bit of key space
where these errors can be emitted.
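
Until something like that exists, a map function can approximate it by hand.
This is only a sketch of a convention, not a CouchDB feature: the "_error"
key prefix and the "amount" field are made up for illustration, and emit()
is stubbed here so the example runs outside the view server.

```javascript
// Stand-in for the view server's emit(); collects rows for inspection.
const rows = [];
function emit(key, value) { rows.push({ key: key, value: value }); }

// Hypothetical reserved key prefix for error rows.
const ERROR_PREFIX = "_error";
// Largest magnitude at which every integer is still exactly representable.
const MAX_EXACT = 9007199254740991;   // 2^53 - 1

// Map function: emits the amount, or an error row instead of throwing.
function map(doc) {
  try {
    if (typeof doc.amount !== "number") {
      throw new TypeError("amount is not a number");
    }
    if (Math.abs(doc.amount) > MAX_EXACT) {
      throw new RangeError("amount may have lost precision");
    }
    emit(doc.amount, null);
  } catch (e) {
    // The document stays visible in the view, under the reserved prefix.
    emit([ERROR_PREFIX, doc._id], String(e.message));
  }
}

map({ _id: "a", amount: 10 });
map({ _id: "b", amount: 9007199254740993 });
map({ _id: "c" });
console.log(JSON.stringify(rows));
```

Since array keys collate after numbers and strings, the error rows would
sort together, away from ordinary numeric keys, and could be fetched with a
startkey/endkey range on the prefix.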

> Translation: there are limits :)

Indeed there are. But apart from the fixed 2/4GB limit for 32-bit machines,
there are variable practical limits which would be different on a machine
with 256MB of RAM versus a machine with 16GB of RAM.

However, I still say that you would be ill-advised to use a JSON document of
anything like this size with couchdb, since every time you read or modify it
you'd have to stream the *whole* thing over HTTP. Translation: such things
become impractical before they actually fail.

Regards,

Brian.