jackrabbit-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: [j3] Repository MicroKernel API draft
Date Fri, 10 Jun 2011 13:13:44 GMT

>Why would we ever want to build our own JSON parsing and serialization
>code? Just use one of the existing libraries out there.

This is similar to asking 'why do you build your own SQL-2 parser and don't
use Lex/Flex/Yacc/JavaCC/ANTLR/another parser tool', or 'why do you build
your own cache and don't use Ehcache/another cache library', or 'why do you
build your own (connection) pooling'. There are multiple reasons:

a) Part of it is JSOP (JSON DIFF) and not JSON. There is no 'standard'
JSOP tokenizer or parser yet, except the one Angela made (within
Jackrabbit - it seems you didn't notice that).

b) A large part of the JSON doesn't need to be fully parsed. The plan is
to store and return the 'raw' JSON, similar to how other modern systems do
transport-format - values don't need to be de-escaped for storage; instead,
the text is stored 'as is'. Using a full-blown parser would unnecessarily
slow down processing.
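
To illustrate what I mean by not fully parsing: the rough sketch below finds
where a JSON value ends and returns the text exactly as it was received,
escapes included. It is not the actual MicroKernel code; the class and method
names (RawJsonScanner, endOfValue, endOfString) are made up for this example,
and it assumes the input is well-formed JSON with 'start' pointing at the
first character of a value.

```java
final class RawJsonScanner {

    /** Returns the index just past the JSON value that starts at 'start'. */
    static int endOfValue(String json, int start) {
        int depth = 0;
        int i = start;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '"') {
                // Skip the whole string; escape sequences are left untouched.
                i = endOfString(json, i);
                if (depth == 0) {
                    return i;
                }
                continue;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                if (depth == 0) {
                    return i; // a bare literal ended by its enclosing container
                }
                if (--depth == 0) {
                    return i + 1;
                }
            } else if (depth == 0 && (c == ',' || Character.isWhitespace(c))) {
                return i; // end of a number, true, false or null
            }
            i++;
        }
        if (depth > 0) {
            throw new IllegalArgumentException("unbalanced brackets");
        }
        return i;
    }

    /** 'open' is the index of the opening quote; returns the index after the closing one. */
    static int endOfString(String json, int open) {
        int i = open + 1;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '\\') {
                i += 2; // skip the escaped character without decoding it
            } else if (c == '"') {
                return i + 1;
            } else {
                i++;
            }
        }
        throw new IllegalArgumentException("unterminated string");
    }
}
```

The raw value is then just json.substring(start, RawJsonScanner.endOfValue(json, start)),
which can be stored and returned without any de-escaping or re-formatting.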

c) One problem is how to preserve the property type of a value. JSON only
supports very few data types. See also
http://en.wikipedia.org/wiki/JSON#Unsupported_native_data_types - There is
a relatively simple solution which _requires_ that the MicroKernel doesn't
re-format the JSON text: http://en.wikipedia.org/wiki/JSON#cite_note-8

d) Because JSON is so simple, there is simply very little code needed for
tokenizing, parsing, and especially generating JSON.
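
As a rough indication of how little is needed for generating JSON, here is a
sketch of a string encoder. The class name SimpleJsonEncoder is made up for
this example and is not existing Jackrabbit code; it only covers string
values, which is the one part of generation that needs any care.

```java
final class SimpleJsonEncoder {

    /** Returns 's' as a quoted JSON string with the mandatory characters escaped. */
    static String encode(String s) {
        StringBuilder b = new StringBuilder(s.length() + 2).append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  b.append("\\\""); break;
                case '\\': b.append("\\\\"); break;
                case '\n': b.append("\\n"); break;
                case '\r': b.append("\\r"); break;
                case '\t': b.append("\\t"); break;
                default:
                    if (c < ' ') {
                        // Remaining control characters need a \ u escape in JSON.
                        b.append(String.format("\\u%04x", (int) c));
                    } else {
                        b.append(c);
                    }
            }
        }
        return b.append('"').toString();
    }
}
```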

e) The existing JSON parsers I found are simply a pain to use. I want a
tokenizer, not a parser. Not a DOM-style parser that unnecessarily creates
a huge number of little objects. And not a callback/event/handler style
parser where you have to remember the state in some really ugly way. Part
of the current MicroKernel uses org.json.simple.parser.JSONParser, and I
actually find that part of the code painful and ugly.
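
To make the 'tokenizer, not a parser' point concrete, this is roughly the
style of API I have in mind: the caller pulls one token at a time, no object
tree is built, and no handler has to remember state. It is only a sketch under
my own assumptions; the name SimpleJsonTokenizer is made up, tokens are
returned as raw text, and there is no validation beyond unterminated strings.

```java
final class SimpleJsonTokenizer {

    private static final String PUNCTUATION = "{}[]:,";

    private final String json;
    private int pos;

    SimpleJsonTokenizer(String json) {
        this.json = json;
    }

    /** Returns the next raw token, or null at the end of the input. */
    String next() {
        while (pos < json.length() && Character.isWhitespace(json.charAt(pos))) {
            pos++;
        }
        if (pos >= json.length()) {
            return null;
        }
        int start = pos;
        char c = json.charAt(pos);
        if (PUNCTUATION.indexOf(c) >= 0) {
            pos++;
        } else if (c == '"') {
            pos++;
            // Walk to the closing quote, stepping over escaped characters.
            while (pos < json.length() && json.charAt(pos) != '"') {
                pos += json.charAt(pos) == '\\' ? 2 : 1;
            }
            if (pos >= json.length()) {
                throw new IllegalArgumentException("unterminated string");
            }
            pos++;
        } else {
            // Number, true, false or null: read up to the next delimiter.
            while (pos < json.length()
                    && PUNCTUATION.indexOf(json.charAt(pos)) < 0
                    && json.charAt(pos) != '"'
                    && !Character.isWhitespace(json.charAt(pos))) {
                pos++;
            }
        }
        return json.substring(start, pos);
    }
}
```

The calling code is then a plain loop, for example
for (String t = tokenizer.next(); t != null; t = tokenizer.next()) { ... },
and the caller decides which tokens to look at and which to skip.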

