Date: Thu, 7 Feb 2013 04:53:15 +0000 (UTC)
From: "Paul Joseph Davis (JIRA)"
To: dev@couchdb.apache.org
Subject: [jira] [Commented] (COUCHDB-1670) Replicator crashes if numbers in checkpoint docs are expressed in scientific notation

[ https://issues.apache.org/jira/browse/COUCHDB-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573190#comment-13573190 ]

Paul Joseph Davis commented on COUCHDB-1670:
--------------------------------------------

Jason and Jens are right here, although I do find it a bit surprising that we actually have an issue here given how Erlang treats numbers. My only guess is that we have a guard for is_integer/1 instead of is_number/1 which would badarg on the parsed value (mochijson2 at least would parse that as a float).
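The guess above can be illustrated outside Erlang. The following is a minimal Python sketch, not CouchDB code: the function names parse_seq_strict and parse_seq_tolerant are hypothetical, and Python's json module stands in for mochijson2. It shows that a parser can hand back a float for the scientific-notation form, and that an integer-only check then rejects a value a number check would accept.

```python
import json


def parse_seq_strict(value):
    """Hypothetical analogue of an is_integer/1 guard: only ints pass."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    raise TypeError("expected an integer, got %r" % (value,))


def parse_seq_tolerant(value):
    """Hypothetical analogue of an is_number/1 guard: ints and integral floats pass."""
    if isinstance(value, bool):
        raise TypeError("expected a number, got %r" % (value,))
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer():
        return int(value)
    raise TypeError("expected an integral number, got %r" % (value,))


# The two encodings from the issue report.
plain = json.loads('{"source_last_seq": 1234567}')["source_last_seq"]
sci = json.loads('{"source_last_seq": 1.234567e+06}')["source_last_seq"]

print(type(plain).__name__, type(sci).__name__)  # the parser picks different types
print(parse_seq_tolerant(sci))                   # accepted and normalized

try:
    parse_seq_strict(sci)
except TypeError as exc:
    print("strict check failed:", exc)
```

Whether the real crash comes from such a guard is still only a guess; the sketch just shows why the two encodings can diverge after parsing even though they denote the same number.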
Couple minor comments on discussion:

[~snej] has it right in that we can't expect that JSON will roundtrip byte-for-byte when we have an intermediary translation into an Erlang representation. We already rely on them facts so that we can tell people to sshh when we mutate number representations.

[~jhs] is kinda right, but it's not just a series of numerals, though it's not much more than "looks like a valid number". While the encoding differences aren't quite whitespace-difference levels, they are definitely below the threshold of what we should tolerate, especially considering what we're using them for.

I also have no idea what [~jhs] is talking about with whitespace in the key. If there's truth to that then it sounds like a bug and not just "merely" a JSON encoding difference.

[~jhs] is also quoting Postel's law, which is a crock, and I have spent much time trying to quash the influence of that terrible idea in the project. The number of times I've gotten pissed trying to remember if it's descending=true or reverse=true and checking if I have typos is annoyingly non-zero.

[~wohali] is also right in the generic sense that since (hehehe) should not be restricted to a numerical value, and if we didn't have what appear to be latent bugs based on that assumption this probably wouldn't even be an issue.

And if y'all want to spend more time on this, start investigating round tripping the value 1.1 through a JSON decoder/encoder pair. I'll be here with the tissues when you get to asserting 56bit rounding precisions with the GNU libc strtod assumptions.
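To make the round-tripping point concrete, here is a small Python sketch. It uses Python's json module purely as an example decoder/encoder pair; it says nothing about how mochijson2 or any other Erlang encoder formats numbers. The helper name roundtrip is made up for illustration.

```python
import json
from decimal import Decimal


def roundtrip(text):
    """Decode a JSON number and encode it again with the same library."""
    return json.dumps(json.loads(text))


# Same numeric values, different bytes after one decode/encode cycle.
examples = ["1234567", "1.234567e+06", "1.10", "1E2"]
results = {text: roundtrip(text) for text in examples}
for text, out in results.items():
    print("%-14s -> %s" % (text, out))

# The double that the literal 1.1 decodes to is not exactly 1.1.
exact = Decimal(json.loads("1.1"))
print(exact)
```

Only the plain integer comes back byte-for-byte identical here. The last line prints the exact binary value stored for 1.1, which is where the fun with rounding and strtod assumptions starts.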
> Replicator crashes if numbers in checkpoint docs are expressed in scientific notation
> -------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-1670
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1670
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Jens Alfke
>
> The CouchDB 1.2 replicator process crashes with an Erlang exception when parsing a checkpoint document read back from a remote database, if numbers in the document were JSON-encoded in scientific notation instead of as integers. This includes the properties source_last_seq, end_last_seq, start_last_seq.
> That is, the following encoding works fine:
>     ..., "source_last_seq": 1234567, ...
> whereas this completely-equivalent encoding causes an exception:
>     ..., "source_last_seq": 1.234567e+06, ...
> This issue raised its head as a result of a CouchDB-compatible engine I'm writing (the Couchbase Sync Gateway) which can serve as a passive replication endpoint. It's implemented in Go, and the Go JSON package has the side effect of (a) parsing all JSON numbers into type 'double', and (b) encoding all doubles into JSON using scientific notation if they're more than six digits long. The net effect is that when CouchDB stores a checkpoint into the Sync Adapter's database and then later reads it back, it barfs due to the scientific notation.