Date: Mon, 5 Mar 2012 15:06:21 +0100
Message-ID: 
 <CAFYk8NknStz8u0+Xe-_6W==qjkFq1xwDmzHDe8ngTPGMJiaxmg@mail.gmail.com>
Subject: Re: [jr3] Tree model
From: Stefan Guggisberg <stefan.guggisberg@gmail.com>
To: dev@jackrabbit.apache.org

On Mon, Mar 5, 2012 at 12:38 PM, Jukka Zitting <jukka.zitting@gmail.com> wrote:
> Hi,
>
> On Mon, Mar 5, 2012 at 11:25 AM, Thomas Mueller <mueller@adobe.com> wrote:
>> If we want to use distinct interfaces for read-only and writable nodes,
>> what about "ImmutableNode" and "MutableNode extends ImmutableNode".
>
> That's troublesome because then a client that's given an ImmutableNode
> can't rely on the instance being immutable (because it could in fact
> be a MutableNode instance).
>
> Anyway, see https://gist.github.com/1977909 (and below) for my latest
> draft of these interfaces. Note also the javadocs.
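
to make the concern about "MutableNode extends ImmutableNode" concrete, here is a minimal sketch. the interfaces and the map-backed SimpleNode class are hypothetical and exist only for this illustration; they are not part of the proposed draft.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interfaces, named after the suggestion quoted above.
interface ImmutableNode {
    String getProperty(String name);
}

interface MutableNode extends ImmutableNode {
    void setProperty(String name, String value);
}

// Hypothetical map-backed implementation, for illustration only.
class SimpleNode implements MutableNode {
    private final Map<String, String> properties = new HashMap<>();

    public String getProperty(String name) {
        return properties.get(name);
    }

    public void setProperty(String name, String value) {
        properties.put(name, value);
    }
}

class ImmutabilityDemo {
    // Returns true if a value read through the "immutable" view changed
    // between two reads of the same property.
    static boolean viewChanged() {
        MutableNode writable = new SimpleNode();
        writable.setProperty("title", "before");

        ImmutableNode view = writable; // legal upcast, same instance
        String first = view.getProperty("title");

        writable.setProperty("title", "after");
        String second = view.getProperty("title");

        return !first.equals(second);
    }
}
```

a client holding only the ImmutableNode reference still sees the change, which is why the subtype relationship can't guarantee immutability.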

i am commenting from the mk POV, assuming that the proposed
interface is going to be used below the mk api.

i am generally in favor of your proposal. however, contrary to my earlier
statement, i strongly believe now that the order of child nodes is best
handled on an upper layer, like you originally suggested in this thread.

i.e.

Properties and child nodes are all addressed using an unordered
name->item mapping on the parent node.

what made me change my mind is that the JSON model
doesn't (naturally) support ordered child entries. i'd forgotten
about that, and i guess that mandating order in a JSON-like
model spells trouble sooner or later.

i also realized that the overhead of handling order in an
upper layer (above the mk) is probably not too severe, since
it's only required when iterating over the child nodes of
'orderable' nodes. path resolution, for example, wouldn't be affected.
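
a rough sketch of how an upper layer might apply a user-defined order on top of an unordered name->item mapping. it assumes the upper layer keeps the order as a list of child names somewhere (e.g. in a hidden property); the class and method names are made up for this example and nothing here is part of the mk api.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrderedChildren {

    /**
     * Combines the child names reported by the lower layer with the
     * order recorded by the upper layer (both hypothetical inputs).
     *
     * @param childNames  names of the existing child nodes, in whatever
     *                    stable order the lower layer returns them
     * @param storedOrder names in the user-defined order; may be stale
     * @return child names in user-defined order
     */
    static List<String> orderedNames(Set<String> childNames,
                                     List<String> storedOrder) {
        Set<String> result = new LinkedHashSet<>();
        // First the children that the stored order knows about,
        // skipping names that no longer exist.
        for (String name : storedOrder) {
            if (childNames.contains(name)) {
                result.add(name);
            }
        }
        // Then any children missing from the stored order, in the
        // lower layer's own order (already-present names are kept
        // at their existing position).
        result.addAll(childNames);
        return new ArrayList<>(result);
    }
}
```

only iteration goes through this step; looking up a single child by name would still be a plain name lookup on the lower layer.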

for getChildNodeEntries() we can therefore drop the distinction
between user-defined and native order and just
state that the iteration order is stable.
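
for illustration, a simplified variant of the paging contract without the ones' complement offset could look roughly like this. it's only a sketch: the class is hypothetical, it returns names instead of ChildNodeEntry instances to stay short, and the sorted map is just one simple way to get a stable iteration order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class StableChildNames {

    // Hypothetical storage: child name -> content id. A sorted map gives
    // a stable order; any other stable order would do as well.
    private final TreeMap<String, String> children = new TreeMap<>();

    void addChild(String name, String contentId) {
        children.put(name, contentId);
    }

    /**
     * Returns up to 'length' child names starting at 'offset'.
     * A negative length means "all remaining entries". Ranges that are
     * partially or completely beyond the number of children are truncated.
     */
    List<String> getChildNodeNames(int offset, int length) {
        if (offset < 0) {
            // No special meaning for negative offsets in this variant.
            throw new IllegalArgumentException("offset must be >= 0");
        }
        List<String> names = new ArrayList<>(children.keySet());
        if (offset >= names.size()) {
            return new ArrayList<>();
        }
        int end = (length < 0)
                ? names.size()
                : (int) Math.min((long) offset + length, names.size());
        return new ArrayList<>(names.subList(offset, end));
    }
}
```

repeated calls on the same instance return the entries in the same order, which is all the contract would promise.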

cheers
stefan

ps: i am currently off work and can't follow the discussion
in real-time...

>
> This draft is fairly close to the model already present in
> o.a.j.mk.model, with the most crucial difference being that a
> ChildNodeEntry returns a NodeState reference instead of just a content
> id. In other words, the underlying addressing mechanism is hidden
> below this interface.
>
> Note that since we considered it best to decouple the methods for
> accessing properties and child nodes, i.e. have getProperty(String)
> and getNode(String) instead of just a getItem(String), it actually
> makes sense to have a getName() method on the PropertyState instance.
> Otherwise a separate PropertyEntry interface would be needed in order
> to avoid unnecessary extra string lookups when iterating over all
> properties. The desire to avoid extra lookups is also why I'd rather
> use a separate interface for properties instead of adding extra
> methods to NodeState.
>
> BR,
>
> Jukka Zitting
>
>
> /**
>  * A content tree consists of nodes and properties, each of which
>  * evolves through different states during its lifecycle. This interface
>  * represents a specific, immutable state of a node in a content tree.
>  * Depending on context, a NodeState instance can be interpreted as
>  * representing the state of just that node, of the subtree starting at
>  * that node, or of an entire tree in case it's a root node.
>  * <p>
>  * The crucial difference between this interface and the similarly named
>  * class in Jackrabbit 2.x is that this interface represents a specific,
>  * immutable state of a node, whereas the Jackrabbit 2.x class represented
>  * the "current" state of a node.
>  *
>  * <h2>Properties and child nodes</h2>
>  * <p>
>  * A node consists of an unordered set of properties, and an ordered set
>  * of child nodes. Each property and child node is uniquely named and a
>  * single name can only refer to a property or a child node, not both at
>  * the same time.
>  *
>  * <h2>Immutability and thread-safety</h2>
>  * <p>
>  * As mentioned above, all node and property states are always immutable.
>  * Thus repeating a method call is always guaranteed to produce the same
>  * result as before unless some internal error occurs (see below). Note
>  * however that this immutability only applies to a specific state instance.
>  * Different states of a node can obviously be different, and in some cases
>  * even different instances of the same state may behave slightly differently.
>  * For example due to performance optimization or other similar changes the
>  * iteration order of properties may be different for two instances of the
>  * same node state. However, all such changes must file
>  * <p>
>  * In addition to being immutable, a specific state instance is guaranteed to
>  * be fully thread-safe. Possible caching or other internal changes need to
>  * be properly synchronized so that any number of concurrent clients can
>  * safely access a state instance.
>  *
>  * <h2>Persistence and error-handling</h2>
>  * <p>
>  * A node state can be (and often is) backed by local files or network
>  * resources. All IO operations or related concerns like caching should be
>  * handled transparently below this interface. Potential IO problems and
>  * recovery attempts like retrying a timed-out network access need to be
>  * handled below this interface, and only hard errors should be thrown up
>  * as {@link RuntimeException unchecked exceptions} that higher level code
>  * is not expected to be able to recover from.
>  * <p>
>  * Since this interface exposes no higher level constructs like access
>  * controls, locking, node types or even path parsing, there's no way
>  * for content access to fail because of such concerns. Such functionality
>  * and related checked exceptions or other control flow constructs should
>  * be implemented on a higher level above this interface.
>  *
>  * <h2>Decoration and virtual content</h2>
>  * <p>
>  * Not all content exposed by this interface needs to be backed by actual
>  * persisted data. An implementation may want to provide derived
>  * data like for example the aggregate size of the entire subtree as an
>  * extra virtual property. A virtualization, sharding or caching layer
>  * could provide a composite view over multiple underlying content trees.
>  * Or a basic access control layer could decide to hide certain content
>  * based on specific rules. All such features need to be implemented
>  * according to the API contract of this interface. A separate higher level
>  * interface needs to be used if an implementation can't for example
>  * guarantee immutability of exposed content as discussed above.
>  */
> public interface NodeState {
>
>    /**
>     * Returns the named property. The name is an opaque string and
>     * is not parsed or otherwise interpreted by this method.
>     * <p>
>     * The namespace of properties and child nodes is shared, so if
>     * this method returns a non-<code>null</code> value for a given
>     * name, then {@link #getChildNode(String)} is guaranteed to return
>     * <code>null</code> for the same name.
>     *
>     * @param name name of the property to return
>     * @return named property, or <code>null</code> if not found
>     */
>    PropertyState getProperty(String name);
>
>    /**
>     * Returns an iterable of the properties of this node. Multiple
>     * iterations are guaranteed to return the properties in the same
>     * order, but the specific order used is implementation-dependent
>     * and may change across different states of the same node.
>     *
>     * @return properties in some stable order
>     */
>    Iterable<PropertyState> getProperties();
>
>    /**
>     * Returns the named child node. The name is an opaque string and
>     * is not parsed or otherwise interpreted by this method.
>     * <p>
>     * The namespace of properties and child nodes is shared, so if
>     * this method returns a non-<code>null</code> value for a given
>     * name, then {@link #getProperty(String)} is guaranteed to return
>     * <code>null</code> for the same name.
>     *
>     * @param name name of the child node to return
>     * @return named child node, or <code>null</code> if not found
>     */
>    NodeState getChildNode(String name);
>
>    /**
>     * Returns the number of child nodes of this node.
>     *
>     * @return number of child nodes
>     */
>    int getChildNodeCount();
>
>    /**
>     * Returns an iterable of the child node entries starting from the
>     * given offset and containing the given number of entries. The order
>     * of child nodes is normally as specified by the client that created
>     * or reordered them.
>     * <p>
>     * The order of child nodes is by default as specified by the
>     * client that created or reordered them, but the caller can also
>     * ask the underlying implementation to return nodes in their
>     * native order that may be more efficient to iterate over.
>     * To request such native ordering, the caller should specify
>     * the offset parameter in ones' complement form
>     * (i.e. <code>~offset</code>).
>     * <p>
>     * If the requested range is completely or partially beyond the number
>     * of child nodes of this node, then only those child nodes that match
>     * the range are returned. Thus the returned iterable may contain less
>     * than the requested number of entries.
>     *
>     * @param offset start offset from which to return entries;
>     *               with <code>0</code> being the offset of the first entry,
>     *               and negative offsets interpreted as described above
>     * @param length maximum number of entries to return;
>     *               use <code>-1</code> to return all remaining entries
>     * @return requested child node entries
>     */
>    Iterable<ChildNodeEntry> getChildNodeEntries(int offset, int length);
>
> }
>
>
>
> /**
>  * TODO: document
>  */
> public interface PropertyState {
>
>    /**
>     * TODO: document
>     */
>    String getName();
>
>    /**
>     * FIXME: replace with type-specific accessors
>     */
>    String getEncodedValue();
>
> }
>
>
>
> /**
>  * TODO: document
>  */
> public interface ChildNodeEntry {
>
>    /**
>     * TODO: document
>     */
>    String getName();
>
>    /**
>     * TODO: document
>     */
>    NodeState getNode();
>
> }