From: Jan Lehnardt
To: couchdb-dev@incubator.apache.org
Subject: Fwd: cjson.erl
Date: Mon, 7 Jul 2008 23:31:07 +0200

Bob and Damien agree it seems :)

Begin forwarded message:

> From: "Bob Ippolito"
> Date: July 7, 2008 11:18:29 PM GMT+02:00
> To: "Joe Armstrong"
> Cc: "Jan Lehnardt"
> Subject: Re: cjson.erl
>
> Binaries are ugly to type in source code, that's the problem... it's
> tedious, that's all.
>
> On Mon, Jul 7, 2008 at 12:16 PM, Joe Armstrong wrote:
>> Re Damien's comments on binaries - they are not *that* ugly:
>> <<"abc">> instead of "abc". They were ugly a couple of years ago
>> before we changed the format.
>>
>> The important thing to note is that *long* strings should almost
>> always be represented as binaries.
>>
>> What is a long string? (It depends; I guess anything more than
>> 50 bytes should be a binary.) Storing (say) XML as a {Tag, [Attrs], Data}
>> tree where Data is a string or binary has great implications for
>> performance. Basically it doesn't matter how you represent Tag and
>> Attrs, but Data should be a binary and NOT a string.
>>
>> To be on the safe side I'd choose binaries.
>>
>> Use of atoms should be discouraged - since you don't want to stress
>> the atom table (which is not garbage collected).
>>
>> As it stands the CouchDB distribution has three different modules for
>> JSON terms - which might confuse the unwary ...
>>
>> Cheers
>>
>> /Joe
>>
>> On Mon, Jul 7, 2008 at 8:05 PM, Jan Lehnardt wrote:
>>> Hello Bob & Joe,
>>> here's Damien's take on the JSON issue.
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>> Begin forwarded message:
>>>
>>>> From: Damien Katz
>>>> Date: July 7, 2008 7:25:28 PM GMT+02:00
>>>> To: couchdb-dev@incubator.apache.org
>>>> Subject: Re: cjson.erl
>>>> Reply-To: couchdb-dev@incubator.apache.org
>>>>
>>>> So, the sad history of cjson.erl: I started with the Erlang JSON
>>>> library I found on the json.org website (which now appears to be a
>>>> dead link), and used that for a while. For reasons I cannot remember
>>>> (bugs or performance), I switched to using the mochiweb json library.
>>>> However, it used slightly different conventions for using Erlang
>>>> terms to represent the JSON. For one thing, objects were
>>>> {struct, [...]}, while the json.org library used {obj, [...]}.
>>>> I think there was one other thing, but I can't remember now.
>>>>
>>>> Anyway, rather than change all my code to use the new convention,
>>>> I changed the mochi library to use the json.org conventions and
>>>> changed the name to cjson.erl (for reasons I again cannot remember).
>>>> Some of the comments in the library are likely wrong because of this.
>>>>
>>>> Now switching libraries is easy, but switching the Erlang
>>>> representation of JSON objects is not. However, I'd be glad to switch
>>>> CouchDB over to using a different Erlang representation of JSON, if
>>>> there is a "blessed" Erlang format. Otherwise, I'll need practical
>>>> reasons for doing so. Performance is one such reason.
>>>>
>>>> One thing I'm not happy about is the idea of representing strings
>>>> using binaries. From a code aesthetics point of view, it uglifies the
>>>> source dramatically, but I think it might also cause lots of extra
>>>> conversions between binary strings and the normal list strings used
>>>> in most Erlang libraries and APIs. The memory and performance
>>>> improvements will have to be big to make up for the extra
>>>> complexities in the source.
>>>>
>>>> -Damien
>>>>
>>>> On Jul 7, 2008, at 12:24 PM, Jan Lehnardt wrote:
>>>>
>>>>> Heya,
>>>>> Joe Armstrong tries to get the Erlang community to agree
>>>>> on a single JSON library that fits everybody's needs. The
>>>>> biggest players here (according to Joe I guess) are
>>>>> MochiMedia and ourselves.
>>>>>
>>>>> Hence the dialogue I quote below:
>>>>>
>>>>> Begin forwarded message:
>>>>>
>>>>>> From: "Joe Armstrong"
>>>>>> Date: July 7, 2008 10:51:07 AM GMT+02:00
>>>>>> To: "Jan Lehnardt"
>>>>>> Cc: "Bob Ippolito"
>>>>>> Subject: cjson.erl
>>>>>>
>>>>>> Hi Jan,
>>>>>>
>>>>>> [CC'd to Bob Ippolito (Glad to see the facebook stuff taking off -
>>>>>> great work :-)) ]
>>>>>>
>>>>>> I've been staring at cjson.erl ...
>>>>>>
>>>>>> The comments say it's derived from mochijson.erl.
>>>>>>
>>>>>> In mochiweb there are two JSON representations:
>>>>>> mochijson2.erl and mochijson.erl
>>>>>>
>>>>>> I think the "2" is the better one :-)
>>>>>>
>>>>>> I think it would be a good idea if you could come to some agreement
>>>>>> with the mochiweb people as to the best representation of
>>>>>> JSON terms in Erlang and both go out with a single library.
>>>>>>
>>>>>> cjson.erl lacks a type declaration in the documentation - which it
>>>>>> needs (reading the code is hopeless)
>>>>>>
>>>>>> mochijson2.erl has this type declaration
>>>>>>
>>>>>> %% @type json_string() = atom | binary()
>>>>>> %% @type json_number() = integer() | float()
>>>>>> %% @type json_array() = [json_term()]
>>>>>> %% @type json_object() = {struct, [{json_string(), json_term()}]}
>>>>>> %% @type json_term() = json_string() | json_number() | json_array() |
>>>>>> %%                     json_object()
>>>>>>
>>>>>> I'm not sure about the additional "struct" tag - nor the additional
>>>>>> atom tag in json_string
>>>>>>
>>>>>> How about ...
>>>>>>
>>>>>> @type json_object = {[json_tag::binary(), json_term()]}
>>>>>> @type json_string() = binary()
>>>>>>
>>>>>> this makes the Erlang term map to JSON in an unambiguous manner and
>>>>>> the compiler should be able to generate faster code, since
>>>>>>
>>>>>> unpack(Json) when is_binary(Json) -> ...
>>>>>>
>>>>>> will only have disjoint branches.
>>>>>>
>>>>>> I think that:
>>>>>>
>>>>>> lists should *only* be used for json_arrays
>>>>>> binary should *only* be used for json_strings
>>>>>> json objs should *only* be tuples (of pairs)
>>>>>> {{Tag,Val},{Tag,Val},...}
>>>>>> (possibly {Tag1,Val1,Tag2,Val2,....} might be better???)
>>>>>>
>>>>>> I think it would be a good idea to isolate this problem - agree
>>>>>> (having done some measurements, on the fastest and *prettiest* way
>>>>>> to do this) - jointly change your code bases (at the same time) and
>>>>>> then tell the world - then issue ONE library.
>>>>>>
>>>>>> Just for fun I've downloaded the wikipedia using the ideas in
>>>>>>
>>>>>> http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
>>>>>>
>>>>>> (I want to convert the XML representation of the wikipedia into JSON
>>>>>> and inject it into CouchDB and serve it up with mochiweb - I need to
>>>>>> write a rendering engine to convert wiki markup to HTML (this is
>>>>>> said to be tricky since there is no spec :-)
>>>>>>
>>>>>> This should be a good test of CouchDB and mochiweb)
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> /Joe Armstrong
>>>>>
>>>>>
>>>>> And Bob's reply:
>>>>>
>>>>>> From: "Bob Ippolito"
>>>>>> Date: July 7, 2008 6:12:32 PM GMT+02:00
>>>>>> To: "Joe Armstrong"
>>>>>> Cc: "Jan Lehnardt"
>>>>>> Subject: Re: cjson.erl
>>>>>>
>>>>>> {struct, ...} is what the library that ships with Yaws does, which is
>>>>>> why I used that. Using just {[{Key, Value}]} looks fine to me also and
>>>>>> should be do-able without breaking compatibility immediately.
>>>>>>
>>>>>> The reason atoms are accepted is only for encoding purposes, not for
>>>>>> decoding. There is an unambiguous format from JSON -> Erlang but for
>>>>>> Erlang -> JSON some conveniences are allowed for practical reasons.
>>>>>>
>>>>>> I'm fine with the {struct, ...} -> {...} change that Joe proposed
>>>>>> because I can do that in a backwards compatible way.
>>>>>>
>>>>>> -bob
>>>>>
>>>>> What is our take on this? :) Damien?
>>>>>
>>>>> I'll forward our discussions back to Joe and Bob (in case they don't
>>>>> read this list).
>>>>>
>>>>> Cheers
>>>>> Jan
>>>>>
>>>>
>>>
>>
>> --
>> fra@fra.se; ingvar.akesson@fra.se
>>
>> [Kopia av detta meddelande skickas till FRA för övervakningsändamål.
>> De vill ju ändå läsa min e-post.]
>>
>> [A copy of this mail has been sent to
>> FRA for monitoring purposes. FRA wants to read all my e-mail and has
>> been allowed to do so by the Swedish parliament - in violation of article
>> 12 of the UN Universal Declaration of Human Rights]
>>
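
A minimal Erlang sketch of the representation discussed in this thread, assuming Bob's reading of Joe's proposal: objects as {[{Key, Value}]}, strings as binaries, arrays as lists. The module and function names (json_repr_sketch, kind/1, from_struct/1) are hypothetical and are not taken from cjson.erl or mochijson2.erl. The atoms true, false and null are left out because the type declaration quoted in the thread does not list them. It only illustrates why the guards become disjoint and what a {struct, ...} -> {...} rewrite could look like.

```erlang
-module(json_repr_sketch).
-export([kind/1, from_struct/1]).

%% Classify a decoded term. Each clause guards on a different Erlang
%% type, so at most one clause can match any given term.
kind(T) when is_binary(T)         -> string;
kind(T) when is_number(T)         -> number;
kind(T) when is_list(T)           -> array;
kind({Props}) when is_list(Props) -> object.

%% Rewrite mochijson2-style {struct, Props} objects into {Props},
%% recursing into object values and array elements.
%% Assumption: strings are binaries, so every list is a JSON array.
from_struct({struct, Props}) when is_list(Props) ->
    {[{Key, from_struct(Val)} || {Key, Val} <- Props]};
from_struct(List) when is_list(List) ->
    [from_struct(Elem) || Elem <- List];
from_struct(Other) ->
    Other.
```

For example, json_repr_sketch:from_struct({struct, [{<<"a">>, [1, {struct, []}]}]}) returns {[{<<"a">>, [1, {[]}]}]}, and kind/1 applied to that result returns object. If strings were plain lists instead of binaries, the array clause would also match them, which is the ambiguity the proposal is meant to remove.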