From Jukka Zitting <ju...@zitting.name>
Subject Re: Atom node types
Date Wed, 16 Nov 2005 11:50:37 GMT

David Nuescheler wrote:
> thanks for bringing it up and unfortunately i don't have any
> atom nodetypes for you ;)

No problem, perhaps I should design the Atom node types as a public
modelling exercise.

> but this brings me back to an old discussion about what i called
> back then a "public nodetype library". personally, i feel like
> there is still a lot of uncertainty and no best practices or
> guidelines around "nodetype modelling".

Good point! So far I've designed and used a handful of custom node type
sets, and every time I've run into issues that required modifications
to my original ideas. Extracting such knowledge would certainly be
useful.

> i think peeter wrote a little document on that topic.
> maybe we should have a documentation on nodetype
> modelling including some public sample nodetypes.

I've found the conventions and notation defined by Peeter quite useful.
Would it be possible to publish the document somewhere?

To get things started I just made a quick draft of a possible mapping of
the Atom data model to JCR node types. The node type definitions are
included at the end of this message and based closely on the Atom
syndication format specified in
I've used the notation and naming conventions from Peeter's document.

Some issues I encountered while drafting these node types:

1. Should I map the atom prefix to the official Atom namespace
(http://www.w3.org/2005/Atom), a Jackrabbit namespace
(http://jackrabbit.apache.org/ns/2005/atom), or a personal namespace?

2. I defined atom:Base and atom:Node types as abstract base types to
avoid duplicating the item definitions. Should they be mixin node types,
or should I use some mechanism (perhaps a naming convention) to mark
them as abstract types?

3. Because of the query limitations (JCR-247) I decided to flatten some
of the Atom elements into JCR properties. For example the "title" and
"updated" elements are mapped to just STRING and DATE properties. This
makes little difference for the "updated" property, but the "title"
property loses both the type information (text/html/xhtml) and the
potential xml:lang attribute. This seems like a necessary loss to make
the content efficiently searchable (a hard requirement for this use case).

4. I'm planning to store serialized xhtml in the "title", "subtitle",
"summary", and "content" STRING properties. This covers the Atom content
types (plain text and html are easily and mostly losslessly converted to
xhtml, but not the other way around), but could cause problems because
the xhtml tags will also get indexed. A proper alternative would be to use
nt:resource nodes with binary content and correct media types for the
textfilters to work, but JCR-247 again prevents this option. Storing
xhtml as full node trees seems like overkill for my needs. Another
alternative would be to just lose information by translating all content
to plain text.

5. The current draft is just a straight mapping from the Atom
syndication format, and doesn't fully match the needs of persistent
storage. I'm thinking about perhaps modifying atom:Feed to only contain
the atom:id property and storing the rest of feed information as
atom:Source nodes that would just be linked to the atom:Entry nodes retrieved
from those feed instances. I'll make a new draft once I've thought more
about this.
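
As a rough illustration of the plain-text-to-xhtml conversion mentioned
in point 4, a minimal Java sketch could look like the following. The
class and method names (AtomTextToXhtml, escapeXml, plainTextToXhtml)
are made up for this example and are not part of any JCR or Atom
library. It only covers plain text input; html input would need a real
parser and is not handled here.

```java
public class AtomTextToXhtml {

    /** The XHTML namespace used by the wrapping div element. */
    public static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Escapes the XML special characters in a plain text string. */
    public static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Wraps escaped plain text in a single xhtml div, the same shape
     * that Atom uses for type="xhtml" text constructs. The result is a
     * string that could be stored in a STRING property.
     */
    public static String plainTextToXhtml(String text) {
        return "<div xmlns=\"" + XHTML_NS + "\">" + escapeXml(text) + "</div>";
    }

    public static void main(String[] args) {
        // prints <div xmlns="http://www.w3.org/1999/xhtml">Tom &amp; Jerry &lt;live&gt;</div>
        System.out.println(plainTextToXhtml("Tom & Jerry <live>"));
    }
}
```

The serialized string keeps the markup, so the indexing concern from
point 4 still applies: the div tag and namespace end up in the stored
value along with the text.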

Comments and suggestions are welcome!


Jukka Zitting

Yukatan - http://yukatan.fi/ - info@yukatan.fi
Software craftsmanship, JCR consulting, and Java development


[atom:Base]          extends nt:base
  - xml:base         STRING
  - xml:lang         STRING

[atom:Category]      extends atom:Base
  - atom:term        STRING                       mandatory
  - atom:scheme      STRING
  - atom:label       STRING

[atom:Person]        extends atom:Base
  - atom:name        STRING                       mandatory
  - atom:uri         STRING
  - atom:email       STRING

[atom:Generator]     extends atom:Base
  - atom:text        STRING                       mandatory
  - atom:uri         STRING
  - atom:version     STRING

[atom:Link]          extends atom:Base
  - href             STRING                       mandatory
  - rel              STRING
  - type             STRING
  - hreflang         STRING
  - title            STRING
  - length           STRING

[atom:Node]          extends atom:Base
  - atom:rights      STRING
  + atom:author      requiredTypes atom:Person    multiple
  + atom:contributor requiredTypes atom:Person    multiple
  + atom:category    requiredTypes atom:Category  multiple
  + atom:link        requiredTypes atom:Link      multiple
  + atom:generator   requiredTypes atom:Generator

[atom:Source]        extends atom:Node
  - atom:id          STRING
  - atom:icon        STRING
  - atom:logo        STRING
  - atom:title       STRING
  - atom:subtitle    STRING
  - atom:updated     DATE

[atom:Entry]         extends atom:Node, mix:referenceable
  - atom:id          STRING                       mandatory
  - atom:title       STRING                       mandatory
  - atom:summary     STRING
  - atom:type        STRING
  - atom:content     STRING
  - atom:src         STRING
  - atom:published   DATE
  - atom:updated     DATE                         mandatory
  + atom:source      requiredTypes atom:Source

[atom:Feed]          extends atom:Node, mix:referenceable
  - atom:id          STRING                       mandatory
  - atom:icon        STRING
  - atom:logo        STRING
  - atom:title       STRING
  - atom:subtitle    STRING
  - atom:updated     DATE
  + atom:entry       requiredTypes atom:Entry     multiple

