couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: [VOTE] Apache CouchDB 1.2.0 release, first round
Date Mon, 13 Feb 2012 10:54:19 GMT
Excellent analysis, Paul.

I'd say we go with the patch for 1.2.0 and beyond.

Cheers
Jan
-- 

On Feb 13, 2012, at 07:55 , Paul Davis wrote:

> So yeah. Numbers are hard.
> 
> Firstly, anyone that mentioned RFC 4627 or JavaScript behavior is
> walking down a path entirely orthogonal to the issue at hand. Jason
> almost had it when he talked about them being different but then he
> went off on some weird tangent and lost me.
> 
> In a nutshell, the issue is this:
> 
> CPUs work with bits. Humans (and JSON sorta) work with numbers as a
> string of numerals with some punctuation. This is a lossy conversion.
> 
> So, back to the details.
> 
> COUCHDB-1407 reports that ejson now encodes the value "1.0" as "1".
> While we can wax philosophical about this, the bottom line is that
> this breaks a level of equality. Specifically:
> 
> 1> ejson:decode(ejson:encode(1.0)) =:= 1.0.
> false
> 
> This is precisely because the %g formatting used underneath removes
> trailing zeros and decimal points.
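
The %g behaviour described above can be reproduced outside Erlang. A minimal sketch in Python, whose % operator follows C printf rules for the g conversion; the json module stands in for the decoder here and is an assumption for illustration, not ejson itself:

```python
import json

# Plain "%g" drops the trailing ".0", so the text no longer looks like a float.
encoded = "%g" % 1.0

# A JSON decoder reads digit-only text as an integer, not a float.
decoded = json.loads(encoded)

print(encoded)                  # 1
print(type(decoded).__name__)   # int
```

Python's == would still call 1 and 1.0 equal, so the type check is the closer analogue of Erlang's =:= here.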
> 
> On the face of it, this is bad. And I agree. There's a simple enough
> fix (and it's not what Bob Newson suggested, but I'm going to leave him
> hanging for a bit).
> 
> But, before we get all crazy, we should contemplate a few other fun cases:
> 
> 1. Both mochijson2 and ejson change some number representations
> 
> 5> mochijson2:encode(mochijson2:decode("1E1")) =:= "1E1".
> false
> 6> ejson:encode(ejson:decode("1E1")) =:= "1E1".
> false
> 
> 2. Both mochijson2 and ejson turn numbers with exponents into IEEE-754
> internally
> 
> 7> ejson:decode("1E1") =:= 10.
> false
> 8> mochijson2:decode("1E1") =:= 10.
> false
> 
> 3. Others but I'm tired from staring at math.
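
Cases 1 and 2 above are not specific to Erlang libraries. As a rough illustration only, Python's json module (an assumed stand-in, unrelated to mochijson2 or ejson) shows the same two effects: a number written with an exponent decodes to a float, and re-encoding does not give back the original text.

```python
import json

# "1E1" has an exponent, so the decoder produces a float rather than an integer.
decoded = json.loads("1E1")

# Re-encoding picks its own representation; the original spelling is gone.
reencoded = json.dumps(decoded)

print(type(decoded).__name__, decoded)   # float 10.0
print(reencoded)                         # 10.0
print(reencoded == "1E1")                # False
```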
> 
> Basically, the end result is that we can match mochijson2's decoder
> damn near identically (at least, I know of no differences in
> decoding in Jiffy). But now we get to the hard part.
> 
> Mochijson2 does some fancy ass magic for encoding IEEE-754 values. And
> when I say fancy, I mean, implements an algorithm published in some
> random paper from 1996 based on the paper's author's Scheme
> implementation. I spent about twelve hours today trying to duplicate it
> before I realized that it depends on having an integral type that can
> represent values with more than 64 bits (which made me sad).
> 
> Either way, this is dark voodoo. Anyone that's interested can check out
> mochinum:digits/1 and the supporting functions for some mind bending
> looks into IEEE-754 representations.
> 
> Anyway, bottom line is that 1.0 should be encoded as "1.0". The fix is
> simply to check for a decimal point and append one if it's not
> there. This is what Yajl does and Python appears to behave similarly.
> The patch for Jiffy is at [1] and shows the general idea.
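
The "check for a decimal point and append one" idea can be sketched in a few lines. This is a hypothetical Python rendering, not the Jiffy patch itself; the function name encode_float and the decision to leave exponent forms untouched are assumptions made for the example.

```python
import json

def encode_float(value):
    """Format a finite float so that the text always decodes as a float."""
    text = "%.20g" % value
    # Assumption: exponent forms such as "1e+22" already decode as floats,
    # so ".0" is only appended to plain digit strings like "1" or "-2".
    if "." not in text and "e" not in text:
        text += ".0"
    return text

print(encode_float(1.0))                                   # 1.0
print(type(json.loads(encode_float(1.0))).__name__)        # float
print(type(json.loads(encode_float(1e22))).__name__)       # float
```

Non-finite values (inf, nan) are out of scope for this sketch; JSON has no representation for them anyway.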
> 
> Also, for those still holding on to why %f is not a valid fix, the
> reason is the same as why %g is wrong (and why it needs to be %0.20g).
> printf and friends by default round to six digits (six decimal places
> for %f, six significant digits for %g). So, 0.123456789 would get
> encoded as "0.123457" which loses precision.
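
The default rounding is easy to check. A small sketch in Python, whose % formatting follows the C printf defaults for f and g; it is only meant to illustrate the precision loss, not any CouchDB code path:

```python
value = 0.123456789

default_f = "%f" % value      # six decimal places by default
default_g = "%g" % value      # six significant digits by default
wide_g = "%.20g" % value      # explicit precision, more than a double needs

print(default_f)                   # 0.123457
print(default_g)                   # 0.123457
print(float(default_g) == value)   # False
print(float(wide_g) == value)      # True
```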
> 
> Also, with that patch for Jiffy we never lose precision but the
> eyesore is that we encode 0.1 as "0.10000000000888" (roughly). Some
> people find that offensive but I don't really care enough to learn
> arbitrary precision math routines so people can have slightly prettier
> JSON. And I say that after having spent all day trying to make it
> work.
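
To make the trade-off concrete, here is a hedged comparison in Python. Its repr() is documented to produce the shortest string that reads back to the same float, which is the "prettier" output a shortest-digits routine aims for; nothing here claims to match what mochinum:digits/1 does internally.

```python
long_form = "%.20g" % 0.1    # fixed 20 significant digits
short_form = repr(0.1)       # shortest text that round-trips

print(long_form)                   # 0.10000000000000000555
print(short_form)                  # 0.1

# Both spellings decode to exactly the same double.
print(float(long_form) == 0.1)     # True
print(float(short_form) == 0.1)    # True
```

Both forms are lossless; the only difference is how many digits end up in the JSON text.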
> 
> So, yeah. Fix is simple enough.
> 
> Also, food for thought: A JSON parser/serializer pair that converts
> all numbers to 42 is technically compliant with the JSON spec.
> 
> [1] https://github.com/davisp/jiffy/commit/5042cc946008ee413cc66b9b0addcf33ecd2fd93
> 
> On Sat, Feb 11, 2012 at 8:32 AM, Robert Newson <rnewson@apache.org> wrote:
>> I'd like some opinions on whether COUCHDB-1407 constitutes a release
>> blocking issue. Yes, I understand that the JSON spec is very weak on
>> numbers, blah blah boo splat. Is this because of the switch to ejson?
>> Is jiffy more compatible on this score?
>> 
>> For my part, I'm close to considering it a release-blocking
>> regression. At the very least this change should be included at
>> http://wiki.apache.org/couchdb/Breaking_changes#Changes_Between_1.1.0_and_1.2.0
>> but I'd rather it was fixed.
>> 
>> B.
>> 
>> On 11 February 2012 10:44, Benoit Chesneau <bchesneau@gmail.com> wrote:
>>> On Sat, Feb 11, 2012 at 4:00 AM, Jason Smith <jhs@iriscouch.com> wrote:
>>>> On Sat, Feb 11, 2012 at 3:06 AM, Randall Leeds <randall.leeds@gmail.com> wrote:
>>>>> On Feb 9, 2012 6:09 PM, "Randall Leeds" <randall.leeds@gmail.com> wrote:
>>>>>> 
>>>>>> On Thu, Feb 9, 2012 at 17:48, Jason Smith <jhs@iriscouch.com> wrote:
>>>>>>> Hi, Noah. When I saw it hit Git, I realized it was a breaking change,
>>>>>>> and I asked around. If memory serves, Randall happened to be on at the
>>>>>>> time and he asked me the same question you just did. I said I never
>>>>>>> saw an RFC email and that's when he realized it was not done publicly.
>>>>>> 
>>>>>> I was aware the entire time, but I think the motivation is sound and
>>>>>> it needed to be done. A couple committers spoke up to say we didn't
>>>>>> think it was sensitive enough to warrant the private discussion but
>>>>>> ultimately there was broad consensus on the implementation and the
>>>>>> change itself. One of those (let us all celebrate) extremely rare
>>>>>> times where there wasn't opportunity for broad community input.
>>>>>> 
>>>>>> Creating a view on _users that pulls the relevant parts of a user
>>>>>> document out is the way forward for public profiles, I think.
>>>>>> If someone would write a blog post showing how that works it'd be
>>>>>> great. In retrospect this would have been a great thing to do weeks
>>>>>> ago. Lesson learned.
>>>>> 
>>>>> Just to be clear I don't want to dismiss your concerns. If you believe this
>>>>> needs a config option rather than just documentation now is a good time to
>>>>> speak up loudly since the vote was aborted.
>>>> 
>>>> Thanks. I am concerned. To me, the change is noteworthy but not a showstopper.
>>>> 
>>>> I tested your suggestion, however I do not think it is possible.
>>>> Non-admins cannot access a view.
>>>> 
>>>> $ curlp http://admin:admin@localhost:5984/_users/_design/public -d
>>>> '{"views":{"all":{"map":"function(doc) { emit(doc._id, doc) }"}}}'
>>>> {"ok":true,"id":"_design/public","rev":"1-f605d1ea7825645132f54a91a76a1ddc"}
>>>> 
>>>> $ curl -i http://user:user@localhost:5984/_users/_design/public/_view/all
>>>> HTTP/1.1 403 Forbidden
>>>> Server: CouchDB/1.2.0 (Erlang OTP/R15B)
>>>> Date: Sat, 11 Feb 2012 02:57:43 GMT
>>>> Content-Type: text/plain; charset=utf-8
>>>> Content-Length: 102
>>>> Cache-Control: must-revalidate
>>>> 
>>>> {"error":"forbidden","reason":"Only admins can access design document
>>>> actions for system databases."}
>>>> 
>>> Yes that's by design.
>>> 
>>> - benoît

