jackrabbit-oak-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: Encoding of JCR values in json
Date Thu, 12 Apr 2012 18:15:47 GMT
Hi,

On Thu, Apr 12, 2012 at 7:09 PM, Michael Dürig <michid@gmail.com> wrote:
> We need to come up with an encoding scheme for JCR values in JSON.

Indeed. Thanks for bringing this up!

> While String and Boolean are straightforward, double, long and decimal
> are already more troublesome.

As basic rules for handling the latter types I'd define something like this:

* A JSON value is a double if value.equals(Double.valueOf(value).toString())
* A JSON value is a long if value.equals(Long.valueOf(value).toString())
* A JSON value is a decimal if it's a JSON number that matches neither
of the above two rules

That should cover most typical number values in a very natural way.
Cases like a 123 value that's explicitly typed as a decimal instead of
a long should be handled with explicit typing information as discussed
below.
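The three rules above could be sketched in Java roughly as follows. This is only an illustration: the class name JsonNumberTypes, the Kind enum and the classify method are hypothetical, and the input is assumed to be an already validated JSON number token. The long check comes first here only because Long.valueOf throws for non-integer input; the long and double rules can never both match, so the order does not change the result.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of Oak: classifies a JSON number literal
// using the round-trip rules described in the message.
final class JsonNumberTypes {

    enum Kind { LONG, DOUBLE, DECIMAL }

    // Assumes 'literal' is a syntactically valid JSON number token.
    static Kind classify(String literal) {
        try {
            // Long rule: the literal survives a round trip through Long.
            if (literal.equals(Long.valueOf(literal).toString())) {
                return Kind.LONG;
            }
        } catch (NumberFormatException e) {
            // Not parseable as a long (fraction, exponent or overflow); try double.
        }
        try {
            // Double rule: the literal survives a round trip through Double.
            if (literal.equals(Double.valueOf(literal).toString())) {
                return Kind.DOUBLE;
            }
        } catch (NumberFormatException e) {
            // Not parseable as a double; fall through to decimal.
        }
        // Decimal rule: any other number. BigDecimal rejects non-numeric input
        // with a NumberFormatException.
        new BigDecimal(literal);
        return Kind.DECIMAL;
    }
}
```

Under these rules "123" is a long and "123.4" is a double, while "123.40" and "12345678901234567890" fall through to decimal because neither round trip reproduces the literal exactly.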

I'd be OK with us explicitly *not* supporting special cases like
infinities and NaN values. We'd just throw ValueFormatExceptions for
them on the oak-jcr level and IllegalArgumentExceptions on the
oak-core level (or do something similar). Alternatively, we should use
explicit typing information with a well defined syntax for expressing
such special cases.
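The "not supported" option could look roughly like the following guard on the oak-core side. The class and method names are hypothetical. On the oak-jcr level the same check would throw a ValueFormatException instead, which is left out here to keep the sketch free of the JCR API dependency.

```java
// Hypothetical guard, not part of Oak: rejects double values that have no
// JSON number representation.
final class FiniteDoubles {

    // Returns the value unchanged if it is finite, otherwise throws.
    static double requireFinite(double value) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            throw new IllegalArgumentException(
                    "Not representable as a JSON number: " + value);
        }
        return value;
    }
}
```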

> Finally for binary, date, name, path, reference, weakreference and uri
> there is no direct correspondence in JSON.

Right. There's an additional constraint for binary values in that the
MicroKernel garbage collector needs some way to connect JSON
properties to referenced binaries. It would be useful if the same
convention were also used higher up the stack.

> The way I solved this in spi2microkernel [1] is by encoding values by
> serializing them to their string representation (Value.getString()) and
> prepending the property type (Value.getType()) in radix 16 and a colon (:).

Sounds like a workable solution, though I have some reservations:

* The explicit encoding of numeric constants from JCR seems a bit
troublesome and makes potential extensions more cumbersome.

* The overloading of normal strings means that all string values
need to be checked for whether they need to be escaped.
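To make the second reservation concrete, here is a minimal sketch of a type-prefix encoding of the kind described in the quoted text. It is not the actual spi2microkernel code. The class and method names are hypothetical, and the type constants are copied from javax.jcr.PropertyType (JCR 2.0) to keep the example self-contained.

```java
// Hypothetical codec, not the spi2microkernel implementation: encodes a
// value as "<type in radix 16>:<string form>".
final class TypePrefixCodec {

    // Values mirror javax.jcr.PropertyType constants.
    static final int STRING = 1;
    static final int LONG = 3;
    static final int WEAKREFERENCE = 10;

    static String encode(int type, String value) {
        return Integer.toString(type, 16) + ':' + value;
    }

    // Reads the type from the text before the first colon.
    static int decodeType(String encoded) {
        return Integer.parseInt(encoded.substring(0, firstColon(encoded)), 16);
    }

    // Returns everything after the first colon.
    static String decodeValue(String encoded) {
        return encoded.substring(firstColon(encoded) + 1);
    }

    private static int firstColon(String encoded) {
        int colon = encoded.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Missing type prefix: " + encoded);
        }
        return colon;
    }
}
```

With this scheme the long 124 is stored as "3:124". A plain string such as "a:string" must never be stored unencoded, because a decoder would read the "a" prefix as type 10 (WEAKREFERENCE). That is why every string value has to pass through the encoder, or at least be checked.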

An alternative solution would be to use something like the @TypeHint
feature used by the JSON functionality in Sling. Instead of "@", we
should use something like "::" that's invalid in a JCR name to prevent
conflicts. With such a solution the example JSON object would look
like this:

    "example":{
      "long":123,
      "another long":"124",
      "another long::TypeHint":"long",
      "double":"123.4",
      "double::TypeHint":"double",
      "string":"foo",
      "another string":"a:string",
      "another string::TypeHint":"string"
    }

That's a bit verbose, so we could also put the type hint directly into
the relevant property name, like this:

    "example":{
      "long":123,
      "another long::long":"124",
      "double::double":"123.4",
      "string":"foo",
      "another string::string":"a:string"
    }

The main downsides of this approach are:

* Name-based property accesses will potentially need to traverse
through all properties to find a matching name. That should be
manageable since the implementation can pre-scan all property names
and split them into name and type parts.

* There's a potential for conflicts like when a JSON object contains
both "x" and "x::long" properties. That can be dealt with in a commit
validator that prevents such objects from being persisted.
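Both points can be handled in a single pass over the property names, sketched below. The names TypeHintedNames and index are hypothetical, and the code assumes the "::" separator proposed above. It builds a lookup from the plain name to its optional type hint and rejects objects where two keys resolve to the same name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Oak: pre-scans JSON property keys of the
// form "name" or "name::type".
final class TypeHintedNames {

    static final String SEPARATOR = "::";

    // Maps each plain name to its type hint, or to null when the key has no
    // hint. Throws if two keys share a name, e.g. "x" and "x::long".
    static Map<String, String> index(Iterable<String> keys) {
        Map<String, String> hints = new HashMap<>();
        for (String key : keys) {
            int i = key.indexOf(SEPARATOR);
            String name = i < 0 ? key : key.substring(0, i);
            String hint = i < 0 ? null : key.substring(i + SEPARATOR.length());
            if (hints.containsKey(name)) {
                throw new IllegalArgumentException(
                        "Conflicting properties for name: " + name);
            }
            hints.put(name, hint);
        }
        return hints;
    }
}
```

After the pre-scan, a name-based lookup is a single map access, and a commit validator could run the same scan and refuse to persist any object for which it throws.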

> On a related note: what kind of values do we want to expose from oak-core?
> JSON like or JCR like?

I'd ideally like to keep it JSON-like so we can easily implement a
JavaScript-friendly HTTP mapping directly based on the Oak API without
having to go through extra levels of mapping.

> Implementation wise, would that en/decoding happen inside oak-jcr or oak-core?

I'd put the JSON-JCR type mapping into a shared helper class in
oak-core since it'll be needed by a lot of things like query and node
type handling inside oak-core. But the API interfaces should IMO be
based on JSON types to support cases where JCR typing isn't needed or
wanted.

BR,

Jukka Zitting
