couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Fwd: cjson.erl
Date Mon, 07 Jul 2008 19:23:33 GMT
And another round:


Begin forwarded message:

> From: "Joe Armstrong"
> Date: July 7, 2008 9:16:50 PM GMT+02:00
> To: "Jan Lehnardt" <jan@apache.org>
> Cc: "Bob Ippolito"
> Subject: Re: cjson.erl
>
> Re Damien's comments on binaries - they are not *that* ugly
> <<"abc">> instead of "abc". They were ugly a couple of years ago
> before we changed the format.
>
> The important thing to note is that *long* strings should almost
> always be represented as binaries.
>
> What is a long string? It depends; I guess anything more than
> 50 bytes should be a binary. Storing (say) XML as a {Tag, [Attrs], Data}
> tree where Data is a string or binary has great implications for
> performance. Basically it doesn't matter how you represent Tag and
> Attrs, but Data should be a binary and NOT a string.
>
> To be on the safe side I'd choose binaries.
>
> Use of atoms should be discouraged - since you don't want to stress
> the atom table (which is not garbage collected).
>
> As it stands the CouchDB distribution has three different modules for
> JSON terms - which might confuse the unwary ...
>
> Cheers
>
> /Joe
>
>
>
>
> On Mon, Jul 7, 2008 at 8:05 PM, Jan Lehnardt <jan@apache.org> wrote:
>> Hello Bob & Joe,
>> here's Damien's take on the JSON issue.
>>
>> Cheers
>> Jan
>> --
>>
>> Begin forwarded message:
>>
>>> From: Damien Katz <damienkatz@gmail.com>
>>> Date: July 7, 2008 7:25:28 PM GMT+02:00
>>> To: couchdb-dev@incubator.apache.org
>>> Subject: Re: cjson.erl
>>> Reply-To: couchdb-dev@incubator.apache.org
>>>
>>> So, the sad history of cjson.erl: I started with the Erlang JSON library I
>>> found on the json.org website (which now appears to be a dead link), and
>>> used that for a while. For reasons I cannot remember (bugs or performance),
>>> I switched to using the mochiweb JSON library. However, it used slightly
>>> different conventions for using Erlang terms to represent the JSON. For one
>>> thing, objects were {struct, [...]}, while the json.org library used {obj,
>>> [...]}. I think there was one other thing, but I can't remember now.
>>>
>>> Anyway, rather than change all my code to use the new convention, I changed
>>> the mochi library to use the json.org conventions and changed the name to
>>> cjson.erl (for reasons I again cannot remember). Some of the comments in the
>>> library are likely wrong because of this.
>>>
>>> Now switching libraries is easy, but switching the Erlang representation
>>> of JSON objects is not. However I'd be glad to switch over CouchDB to using
>>> a different Erlang representation of JSON, if there is a "blessed" Erlang
>>> format. Otherwise, I'll need practical reasons for doing so. Performance is
>>> one such reason.
>>>
>>> One thing I'm not happy about is the idea of representing strings using
>>> binaries. From a code aesthetics point of view, it uglifies the source
>>> dramatically, but I think it might also cause lots of extra conversions
>>> between binary strings and normal list strings used in most Erlang libraries
>>> and APIs. The memory and performance improvements will have to be big to
>>> make up for the extra complexities in the source.
>>>
>>> -Damien
>>>
>>> On Jul 7, 2008, at 12:24 PM, Jan Lehnardt wrote:
>>>
>>>> Heya,
>>>> Joe Armstrong tries to get the Erlang community to agree
>>>> on a single JSON library that fits everybody's needs. The
>>>> biggest players here (according to Joe I guess) are
>>>> MochiMedia and ourselves.
>>>>
>>>> Hence the dialogue I quote below:
>>>>
>>>> Begin forwarded message:
>>>>
>>>>> From: "Joe Armstrong"
>>>>> Date: July 7, 2008 10:51:07 AM GMT+02:00
>>>>> To: "Jan Lehnardt" <jan@apache.org>
>>>>> Cc: "Bob Ippolito"
>>>>> Subject: cjson.erl
>>>>>
>>>>> Hi Jan,
>>>>>
>>>>> [CC'd to Bob Ippolito (Glad to see the facebook stuff taking off -
>>>>> great work :-)) ]
>>>>>
>>>>> I've been staring at cjson.erl ...
>>>>>
>>>>> The comments say it's derived from mochijson.erl.
>>>>>
>>>>> In the mochiweb there are two json representations
>>>>> mochijson2.erl  and mochijson.erl
>>>>>
>>>>> I think the "2" is the better one :-)
>>>>>
>>>>> I think it would be a good idea if you could come to some agreement
>>>>> with the mochiweb people as to the best representation of
>>>>> JSON terms in Erlang and both go out with a single library.
>>>>>
>>>>> cjson.erl lacks a type declaration in the documentation - which it needs
>>>>> (reading the code is hopeless)
>>>>>
>>>>> mochijson2.erl has this type declaration
>>>>>
>>>>> %% @type json_string() = atom | binary()
>>>>> %% @type json_number() = integer() | float()
>>>>> %% @type json_array() = [json_term()]
>>>>> %% @type json_object() = {struct, [{json_string(), json_term()}]}
>>>>> %% @type json_term() = json_string() | json_number() | json_array() |
>>>>> %%                     json_object()
>>>>>
>>>>> I'm not sure about the additional "struct" tag - nor the additional
>>>>> atom tag in json_string
>>>>>
>>>>> How about ...
>>>>>
>>>>> @type json_object = {[json_tag::binary(), json_term()]}
>>>>> @type json_string() = binary()
>>>>>
>>>>> this makes the Erlang term map to JSON in an unambiguous manner and
>>>>> the compiler should be able to generate faster code, since
>>>>>
>>>>> unpack(Json) when is_binary(Json) -> ...
>>>>>
>>>>> will only have disjoint branches.
>>>>>
>>>>> I think that:
>>>>>
>>>>> lists should *only* be used for json_arrays
>>>>> binary should *only* be used for json_strings
>>>>> json objs should *only* be tuples (of pairs)
>>>>> {{Tag,Val},{Tag,Val},...}
>>>>> (possibly {Tag1,Val1,Tag2,Val2,....} might be better???)
>>>>>
>>>>> I think it would be a good idea to isolate this problem - agree
>>>>> (having done some measurements, on the fastest and *prettiest* way
>>>>> to do this) - jointly change your code bases (at the same time) and
>>>>> then tell the world - then issue ONE library.
>>>>>
>>>>> Just for fun I've downloaded the wikipedia using the ideas in
>>>>>
>>>>> http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
>>>>>
>>>>> (I want to convert the XML representation of the wikipedia into JSON
>>>>> and inject it into CouchDB and serve it up with mochiweb - I need to
>>>>> write a rendering engine to convert wiki markup to HTML (this is said
>>>>> to be tricky since there is no spec :-)
>>>>>
>>>>> This should be a good test of CouchDB and mochiweb)
>>>>>
>>>>> Cheers
>>>>>
>>>>> /Joe Armstrong
>>>>
>>>>
>>>> And Bob's reply:
>>>>
>>>>> From: "Bob Ippolito"
>>>>> Date: July 7, 2008 6:12:32 PM GMT+02:00
>>>>> To: "Joe Armstrong"
>>>>> Cc: "Jan Lehnardt" <jan@apache.org>
>>>>> Subject: Re: cjson.erl
>>>>>
>>>>> {struct, ...} is what the library that ships with Yaws does, which is
>>>>> why I used that. Using just {[{Key, Value}]} looks fine to me also and
>>>>> should be do-able without breaking compatibility immediately.
>>>>>
>>>>> The reason atoms are accepted is only for encoding purposes, not for
>>>>> decoding. There is an unambiguous format from JSON -> Erlang but for
>>>>> Erlang -> JSON some conveniences are allowed for practical reasons.
>>>>>
>>>>> I'm fine with the {struct, ...} -> {...} change that Joe proposed
>>>>> because I can do that in a backwards compatible way.
>>>>>
>>>>> -bob
>>>>
>>>> What is our take on this? :) Damien?
>>>>
>>>> I'll forward our discussions back to Joe and Bob (in case they don't
>>>> read this list).
>>>>
>>>> Cheers
>>>> Jan
>>>>
>>>
>>>
>>
>>
>

