incubator-couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Fwd: cjson.erl
Date Mon, 07 Jul 2008 21:31:07 GMT
Bob and Damien agree it seems :)

Begin forwarded message:

> From: "Bob Ippolito"
> Date: July 7, 2008 11:18:29 PM GMT+02:00
> To: "Joe Armstrong"
> Cc: "Jan Lehnardt" <jan@apache.org>
> Subject: Re: cjson.erl
>
> Binaries are ugly to type in source code, that's the problem... it's
> tedious, that's all.
>
> On Mon, Jul 7, 2008 at 12:16 PM, Joe Armstrong <joearms@gmail.com> wrote:
>> Re Damien's comments on binaries - they are not *that* ugly
>> <<"abc">> instead of "abc". They were ugly a couple of years ago
>> before we changed the format.
>>
>> The important thing to note is that *long* strings should almost
>> always be represented as binaries.
>>
>> What is a long string? It depends; I guess anything more than
>> 50 bytes should be a binary. Storing (say) XML as a
>> {Tag, [Attrs], Data} tree where Data is a string or binary has great
>> implications for performance. Basically it doesn't matter how you
>> represent Tag and Attrs, but Data should be a binary and NOT a string.
>>
>> To be on the safe side I'd choose binaries.
>>
>> Use of atoms should be discouraged, since you don't want to stress
>> the atom table (which is not garbage collected).
>>
>> As it stands the couchDB distribution has three different modules for
>> JSON terms - which might confuse the unwary ...
>>
>> Cheers
>>
>> /Joe
>>
>>
>>
>>
>> On Mon, Jul 7, 2008 at 8:05 PM, Jan Lehnardt <jan@apache.org> wrote:
>>> Hello Bob & Joe,
>>> here's Damien's take on the JSON issue.
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>> Begin forwarded message:
>>>
>>>> From: Damien Katz <damienkatz@gmail.com>
>>>> Date: July 7, 2008 7:25:28 PM GMT+02:00
>>>> To: couchdb-dev@incubator.apache.org
>>>> Subject: Re: cjson.erl
>>>> Reply-To: couchdb-dev@incubator.apache.org
>>>>
>>>> So, the sad history of cjson.erl: I started with the Erlang JSON
>>>> library I found on the json.org website (which now appears to be a
>>>> dead link), and used that for a while. For reasons I cannot remember
>>>> (bugs or performance), I switched to using the mochiweb JSON library.
>>>> However, it used slightly different conventions for using Erlang
>>>> terms to represent the JSON. For one thing, objects were
>>>> {struct, [...]}, while the json.org library used {obj, [...]}. I
>>>> think there was one other thing, but I can't remember now.
>>>>
>>>> Anyway, rather than change all my code to use the new convention, I
>>>> changed the mochi library to use the json.org conventions and changed
>>>> the name to cjson.erl (for reasons I again cannot remember). Some of
>>>> the comments in the library are likely wrong because of this.
>>>>
>>>> Now, switching libraries is easy, but switching the Erlang
>>>> representation of JSON objects is not. However, I'd be glad to switch
>>>> CouchDB over to a different Erlang representation of JSON if there
>>>> is a "blessed" Erlang format. Otherwise, I'll need practical reasons
>>>> for doing so. Performance is one such reason.
>>>>
>>>> One thing I'm not happy about is the idea of representing strings
>>>> using binaries. From a code aesthetics point of view, it uglifies the
>>>> source dramatically, but I think it might also cause lots of extra
>>>> conversions between binary strings and the normal list strings used
>>>> in most Erlang libraries and APIs. The memory and performance
>>>> improvements will have to be big to make up for the extra
>>>> complexities in the source.
>>>>
>>>> -Damien
>>>>
>>>> On Jul 7, 2008, at 12:24 PM, Jan Lehnardt wrote:
>>>>
>>>>> Heya,
>>>>> Joe Armstrong tries to get the Erlang community to agree
>>>>> on a single JSON library that fits everybody's needs. The
>>>>> biggest players here (according to Joe I guess) are
>>>>> MochiMedia and ourselves.
>>>>>
>>>>> Hence the dialogue I quote below:
>>>>>
>>>>> Begin forwarded message:
>>>>>
>>>>>> From: "Joe Armstrong"
>>>>>> Date: July 7, 2008 10:51:07 AM GMT+02:00
>>>>>> To: "Jan Lehnardt" <jan@apache.org>
>>>>>> Cc: "Bob Ippolito"
>>>>>> Subject: cjson.erl
>>>>>>
>>>>>> Hi Jan,
>>>>>>
>>>>>> [CC'd to Bob Ippolito (Glad to see the facebook stuff taking off -
>>>>>> great work :-)) ]
>>>>>>
>>>>>> I've been staring at cjson.erl ...
>>>>>>
>>>>>> The comments say it's derived from mochijson.erl.
>>>>>>
>>>>>> In the mochiweb there are two json representations
>>>>>> mochijson2.erl  and mochijson.erl
>>>>>>
>>>>>> I think the "2" is the better one :-)
>>>>>>
>>>>>> I think it would be a good idea if you could come to some agreement
>>>>>> with the mochiweb people as to the best representation of
>>>>>> JSON terms in Erlang and both go out with a single library.
>>>>>>
>>>>>> cjson.erl lacks a type declaration in the documentation - which it
>>>>>> needs (reading the code is hopeless)
>>>>>>
>>>>>> mochijson2.erl has this type declaration
>>>>>>
>>>>>> %% @type json_string() = atom | binary()
>>>>>> %% @type json_number() = integer() | float()
>>>>>> %% @type json_array() = [json_term()]
>>>>>> %% @type json_object() = {struct, [{json_string(), json_term()}]}
>>>>>> %% @type json_term() = json_string() | json_number() |
>>>>>> %%                     json_array() | json_object()
>>>>>>
>>>>>> I'm not sure about the additional "struct" tag - nor the additional
>>>>>> atom tag in json_string
>>>>>>
>>>>>> How about ...
>>>>>>
>>>>>> @type json_object = {[json_tag::binary(), json_term()]}
>>>>>> @type json_string() = binary()
>>>>>>
>>>>>> this makes the Erlang term map to JSON in an unambiguous manner and
>>>>>> the compiler should be able to generate faster code, since
>>>>>>
>>>>>> unpack(Json) when is_binary(Json) -> ...
>>>>>>
>>>>>> will only have disjoint branches.
>>>>>>
>>>>>> I think that:
>>>>>>
>>>>>> lists should *only* be used for json_arrays
>>>>>> binary should *only* be used for json_strings
>>>>>> json objs should *only* be tuples (of pairs)
>>>>>> {{Tag,Val},{Tag,Val},...}
>>>>>> (possibly {Tag1,Val1,Tag2,Val2,....} might be better???)
>>>>>>
>>>>>> I think it would be a good idea to isolate this problem - agree
>>>>>> (having done some measurements, on the fastest and *prettiest* way
>>>>>> to do this) - jointly change your code bases (at the same time) and
>>>>>> then tell the world - then issue ONE library.
>>>>>>
>>>>>> Just for fun I've downloaded the wikipedia using the ideas in
>>>>>>
>>>>>> http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
>>>>>>
>>>>>> (I want to convert the XML representation of the wikipedia into
>>>>>> JSON and inject it into CouchDB and serve it up with mochiweb - I
>>>>>> need to write a rendering engine to convert wiki markup to HTML;
>>>>>> this is said to be tricky since there is no spec :-)
>>>>>>
>>>>>> This should be a good test of CouchDB and mochiweb.)
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> /Joe Armstrong
>>>>>
>>>>>
>>>>> And Bob's reply:
>>>>>
>>>>>> From: "Bob Ippolito"
>>>>>> Date: July 7, 2008 6:12:32 PM GMT+02:00
>>>>>> To: "Joe Armstrong"
>>>>>> Cc: "Jan Lehnardt" <jan@apache.org>
>>>>>> Subject: Re: cjson.erl
>>>>>>
>>>>>> {struct, ...} is what the library that ships with Yaws does, which
>>>>>> is why I used that. Using just {[{Key, Value}]} looks fine to me
>>>>>> also and should be do-able without breaking compatibility
>>>>>> immediately.
>>>>>>
>>>>>> The reason atoms are accepted is only for encoding purposes, not for
>>>>>> decoding. There is an unambiguous format from JSON -> Erlang but for
>>>>>> Erlang -> JSON some conveniences are allowed for practical reasons.
>>>>>>
>>>>>> I'm fine with the {struct, ...} -> {...} change that Joe proposed
>>>>>> because I can do that in a backwards compatible way.
>>>>>>
>>>>>> -bob
>>>>>
>>>>> What is our take on this? :) Damien?
>>>>>
>>>>> I'll forward our discussions back to Joe and Bob (in case they don't
>>>>> read this list).
>>>>>
>>>>> Cheers
>>>>> Jan
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> fra@fra.se; ingvar.akesson@fra.se
>>
>> [Kopia av detta meddelande skickas till FRA för övervakningsändamål.
>> De vill ju ändå läsa min e-post.]
>>
>> [A copy of this mail has been sent to
>> FRA for monitoring purposes. FRA wants to read all my e-mail and has
>> been allowed to do so by the Swedish parliament - in violation of article
>> 12 of the UN Universal Declaration of Human Rights]
>>
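
For readers following the thread, the convention Joe proposes and Bob
accepts (binaries only for strings, lists only for arrays, objects as a
one-element tuple wrapping a list of pairs) could look roughly like the
sketch below. This is not code from cjson.erl or mochijson2.erl: the
module name json_shape_sketch, the function names, and the use of the
atoms true/false/null are assumptions made for illustration, and the
string escaping is deliberately minimal. It only shows how each clause
ends up with a disjoint guard, which is the property Joe is after.

```erlang
-module(json_shape_sketch).
-export([encode/1]).

%% Hypothetical encoder for the term shape discussed in the thread:
%%   string -> binary()
%%   number -> integer() | float()
%%   array  -> list()
%%   object -> {[{binary(), Term}]}
%% true/false/null as atoms are an assumption, not part of the thread.
%% Returns an iolist; every clause matches a disjoint set of terms.
encode(true)  -> <<"true">>;
encode(false) -> <<"false">>;
encode(null)  -> <<"null">>;
encode(B) when is_binary(B)  -> [$", escape(B), $"];
encode(I) when is_integer(I) -> integer_to_list(I);
encode(F) when is_float(F)   -> io_lib:format("~p", [F]);
encode(L) when is_list(L)    ->
    [$[, lists:join($,, [encode(E) || E <- L]), $]];
encode({Pairs}) when is_list(Pairs) ->
    [${, lists:join($,, [encode_pair(P) || P <- Pairs]), $}].

%% Object keys must be binaries under this convention.
encode_pair({K, V}) when is_binary(K) ->
    [encode(K), $:, encode(V)].

%% Minimal escaping: only double quote and backslash are handled.
%% Control characters are passed through unescaped in this sketch.
escape(B) ->
    << <<(esc(C))/binary>> || <<C>> <= B >>.

esc($") -> <<"\\\"">>;
esc($\\) -> <<"\\\\">>;
esc(C)  -> <<C>>.
```

With this shape, iolist_to_binary(json_shape_sketch:encode({[{<<"a">>,
[1, <<"x">>]}]})) gives <<"{\"a\":[1,\"x\"]}">>. A plain list string
such as "abc" is treated as an array of integers, which is the cost of
reserving lists for arrays that Damien and Bob point out: string
literals in source have to be written as <<"abc">>.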

