jackrabbit-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: [jr3] Tree model
Date Tue, 28 Feb 2012 15:11:02 GMT
Hi,

> import java.util.Map;
> ... extends Map<String, Tree>

For a low level interface, that would be a lot of methods to be
implemented, with special semantics that might not be desirable (for
example put returns the old data).
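To make the concern concrete, here is a minimal sketch (not code from the proposal) of the two points above: the Map contract obliges put to return the previous value, and a read-only implementation still has to account for every mutator. The class name MapSemanticsSketch and the helper readOnly are hypothetical, introduced only for illustration. The sketch leans on java.util.AbstractMap, which needs only entrySet() and rejects put by default.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class MapSemanticsSketch {

    /** Returns what Map.put hands back when an existing key is overwritten. */
    static String overwrite() {
        Map<String, String> map = new HashMap<>();
        map.put("child", "old");
        // The Map contract requires put to return the previous mapping, so an
        // implementation must look up the old value even if the caller
        // never uses it.
        return map.put("child", "new");
    }

    /** Hypothetical read-only view: only entrySet() is implemented. */
    static <V> Map<String, V> readOnly(Map<String, V> source) {
        final Set<Map.Entry<String, V>> entries =
                Collections.unmodifiableMap(new HashMap<>(source)).entrySet();
        return new AbstractMap<String, V>() {
            @Override
            public Set<Map.Entry<String, V>> entrySet() {
                return entries;
            }
            // put() is inherited from AbstractMap and always throws
            // UnsupportedOperationException.
        };
    }

    public static void main(String[] args) {
        System.out.println(overwrite()); // previous value of "child"
        Map<String, String> view = readOnly(Collections.singletonMap("a", "b"));
        try {
            view.put("x", "y");
        } catch (UnsupportedOperationException e) {
            System.out.println("put rejected");
        }
    }
}
```

AbstractMap keeps the amount of code small, but the full Map surface (remove, putAll, keySet, values and so on) is still part of the type that every caller sees.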

Regards,
Thomas





On 2/28/12 4:08 PM, "Thomas Mueller" <mueller@adobe.com> wrote:

>Hi,
>
>What is your use case, meaning where would this interface be used and why?
>
>Regards,
>Thomas
>
>
>On 2/28/12 3:59 PM, "Jukka Zitting" <jukka.zitting@gmail.com> wrote:
>
>>Hi,
>>
>>[Here's me looking at lower-level details for a change.]
>>
>>I was going through the prototype codebases in the sandbox, trying to
>>come up with some clean and simple lowest-common-denominator-style
>>interface for representing content trees. Basically a replacement for
>>the ItemState model in current Jackrabbit.
>>
>>The trouble I find with both the current Jackrabbit ItemState model
>>and the efforts in the sandbox is that this key concept is modeled as
>>concrete classes instead of as interfaces. Using an interface to
>>describe and document the fundamental tree model gives us a lot of
>>extra freedom on the implementation side (lazy loading, decoration,
>>virtual content, etc.).
>>
>>So, how should we go about constructing such an interface? I started
>>by laying some ground rules based on concepts from the sandbox and
>>past jr3 discussions:
>>
>>  * A content tree is composed of a hierarchy of items
>>  * Tree items are either leaves or non-leaves
>>  * Non-leaves contain zero or more named child items (*all* other
>>data is stored at leaves)
>>  * Each child item is *uniquely* named within its parent
>>  * Names are just opaque strings
>>  * Leaves contain typed data (strings, numbers, binaries, etc.)
>>  * Content trees are *immutable* except in specific circumstances
>>(transient changes)
>>
>>As a corollary of such a proposed design, the following features (that
>>with a different tree model could be a part of the underlying storage
>>model) would need to be handled as higher level constructs:
>>
>>  * Same-name-siblings (perhaps by name-mangling)
>>  * Namespaces and other name semantics
>>  * Ordering of child nodes (perhaps with a custom order property)
>>  * Path handling
>>  * Identifiers and references
>>  * Node types
>>  * Versioning
>>  * etc., etc.
>>
>>As you can see, it's a very low-level interface I'm talking about.
>>With that background, here's what I'm proposing:
>>https://gist.github.com/1932695 (also included as text at the end of
>>this message). Note that this proposal covers only the interfaces for
>>accessing content (with a big TODO in the Leaf interface). A separate
>>builder or factory interface will be needed for content changes in
>>case this design is adopted.
>>
>>Please criticize, as this is just a quick draft and I'm likely to miss
>>something fairly important. I'm hoping to evolve this to something we
>>could use as a well-documented and thought-out internal abstraction for
>>jr3. Or, if this idea is too broken, to provoke someone to provide a
>>good counter-proposal. :-)
>>
>>BR,
>>
>>Jukka Zitting
>>
>>----
>>
>>import java.util.Map;
>>
>>/**
>> * Trees are the key concept in a hierarchical content repository.
>> * This interface is a low-level tree node representation that just
>> * maps zero or more string names to corresponding child nodes.
>> * Depending on context, a Tree instance can be interpreted as
>> * representing just that tree node, the subtree starting at that node,
>> * or an entire tree in case it's a root node.
>> * <p>
>> * For familiarity and easy integration with existing libraries this
>> * interface extends the generic {@link Map} interface instead of
>> * providing a custom alternative. Note also that this interface is
>> * named Tree instead of something like Item or Node to avoid confusion
>> * with the related JCR interfaces.
>> * </p>
>> *
>> * <h2>Leaves and non-leaves</h2>
>> * <p>
>> * Normal tree nodes only contain structural information expressed as
>> * the set of child nodes and their names. The content of a tree,
>> * expressed in data types like strings, numbers and binaries, is stored
>> * in special leaf nodes with no children. Such leaf nodes implement the
>> * {@link Leaf} sub-interface and can be identified and accessed using
>> * the {@link #isLeaf()} and {@link #asLeaf()} methods.
>> * </p>
>> * <p>
>> * Note that even though such leaf nodes are guaranteed to have no
>> * children (i.e. {@link #isLeaf()} implies {@link #isEmpty()}), the
>> * reverse is not necessarily true. It's possible for a non-leaf node to
>> * contain no children, though such cases normally occur only transiently
>> * when new subtrees are being constructed.
>> * </p>
>> *
>> * <h2>Mutability and thread-safety</h2>
>> * <p>
>> * Tree objects are immutable by default and thus safe for concurrent
>> * access. Using a mutator method like {@link #clear()} or
>> * {@link #put(String, Tree)} results in an
>> * {@link UnsupportedOperationException exception}. A new Tree instance
>> * is needed to express a modified content tree. As a result it's safe
>> * to repeat operations like iterating over all child nodes of a Tree
>> * instance and expect results to be the same.
>> * </p>
>> * <p>
>> * In specific situations like when constructing new trees it's possible
>> * for Tree instances to be mutable. Such cases need to be explicitly
>> * documented and managed in a way that prevents thread-safety issues,
>> * for example by keeping a reference to such a mutable Tree instance
>> * local to a single thread.
>> * </p>
>> *
>> * <h2>Persistence and error-handling</h2>
>> * <p>
>> * A Tree instance can be (and often is) backed by local files or network
>> * resources. All IO operations or related concerns like caching should
>> * be handled transparently below this interface. Potential IO problems
>> * and recovery attempts like retrying a timed-out network access need to
>> * be handled below this interface, and only hard errors should be thrown
>> * up as {@link RuntimeException unchecked exceptions} that higher level
>> * code is not expected to be able to recover from.
>> * </p>
>> * <p>
>> * Since this interface exposes no higher level constructs like access
>> * controls, locking, node types or even path parsing, there's no way
>> * for content access to fail because of such concerns. Such
>> * functionality and related checked exceptions or other control flow
>> * constructs should be implemented on a higher level above this
>> * interface.
>> * </p>
>> *
>> * <h2>Decoration and virtual content</h2>
>> * <p>
>> * Not all content exposed by Tree objects needs to be backed by actual
>> * persisted data. An implementation may want to provide derived data,
>> * for example the aggregate size of the entire subtree, as an extra
>> * virtual leaf node. A virtualization, sharding or caching layer
>> * could provide a composite view over multiple underlying content trees.
>> * Or a basic access control layer could decide to hide certain content
>> * based on specific rules. All such features need to be implemented
>> * according to the API contract of this interface. A separate higher
>> * level interface needs to be used if an implementation can't, for
>> * example, guarantee immutability of exposed content as discussed above.
>> * </p>
>> */
>>interface Tree extends Map<String, Tree> {
>>
>>    /**
>>     * Checks whether this is a {@link Leaf} instance. Can be used to
>>     * control program flow without explicit <code>instanceof</code>
>>     * checks for handling leaf content. See also the {@link #asLeaf()}
>>     * method that can additionally take care of type casting.
>>     *
>>     * @return <code>true</code> if this is a {@link Leaf},
>>     *         <code>false</code> if not
>>     */
>>    boolean isLeaf();
>>
>>    /**
>>     * Returns this instance as a {@link Leaf} if possible. Can be used
>>     * to access leaf nodes without <code>instanceof</code> checks or
>>     * explicit type casting. A typical access pattern is:
>>     * <pre>
>>     * Leaf leaf = tree.asLeaf();
>>     * if (leaf != null) {
>>     *     // handle leaf content
>>     * } else {
>>     *     // handle non-leaf content
>>     * }
>>     * </pre>
>>     *
>>     * @return this instance as a {@link Leaf},
>>     *         or <code>null</code> if this is a non-leaf node
>>     */
>>    Leaf asLeaf();
>>
>>}
>>
>>/**
>> * Leaves are special {@link Tree} nodes that contain typed data like
>> * strings, numbers, binaries, etc. This interface extends {@link Tree}
>> * and thus also {@link Map}, but all Leaf instances are guaranteed to
>> * contain zero child nodes. Leaves are always immutable.
>> */
>>interface Leaf extends Tree {
>>
>>    // TODO: Add data access methods
>>
>>}
>
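For readers who want to try the quoted interfaces, the following is a rough sketch of one possible in-memory implementation, together with the asLeaf() access pattern from the Javadoc. It is an illustration under assumptions, not part of the proposal: the names TreeSketch, MemoryTree, StringLeaf, countLeaves and sample are hypothetical, and getString() is an invented accessor standing in for the data access methods that the proposal leaves as a TODO.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class TreeSketch {

    // The two interfaces as proposed in the quoted message.
    interface Tree extends Map<String, Tree> {
        boolean isLeaf();
        Leaf asLeaf();
    }

    interface Leaf extends Tree {
        // Hypothetical accessor; the proposal leaves data access as a TODO.
        String getString();
    }

    /** Immutable non-leaf node; put() inherited from AbstractMap throws. */
    static final class MemoryTree extends AbstractMap<String, Tree> implements Tree {
        private final Map<String, Tree> children;

        MemoryTree(Map<String, Tree> children) {
            // Defensive copy, so later changes to the argument are not visible.
            this.children = Collections.unmodifiableMap(new LinkedHashMap<>(children));
        }

        @Override public Set<Map.Entry<String, Tree>> entrySet() { return children.entrySet(); }
        @Override public boolean isLeaf() { return false; }
        @Override public Leaf asLeaf() { return null; }
    }

    /**
     * Leaf holding a single string. Note that AbstractMap.equals() compares
     * only child entries, so a real implementation would need to override it.
     */
    static final class StringLeaf extends AbstractMap<String, Tree> implements Leaf {
        private final String value;

        StringLeaf(String value) { this.value = value; }

        @Override public Set<Map.Entry<String, Tree>> entrySet() { return Collections.emptySet(); }
        @Override public boolean isLeaf() { return true; }
        @Override public Leaf asLeaf() { return this; }
        @Override public String getString() { return value; }
    }

    /** Counts leaves using the asLeaf() access pattern from the Javadoc. */
    static int countLeaves(Tree tree) {
        Leaf leaf = tree.asLeaf();
        if (leaf != null) {
            return 1;
        }
        int count = 0;
        for (Tree child : tree.values()) {
            count += countLeaves(child);
        }
        return count;
    }

    /** Builds a small example tree: root { a: leaf, b: { c: leaf } }. */
    static Tree sample() {
        Map<String, Tree> inner = new LinkedHashMap<>();
        inner.put("c", new StringLeaf("world"));
        Map<String, Tree> root = new LinkedHashMap<>();
        root.put("a", new StringLeaf("hello"));
        root.put("b", new MemoryTree(inner));
        return new MemoryTree(root);
    }

    public static void main(String[] args) {
        System.out.println(countLeaves(sample()));
    }
}
```

Because the constructors take ordinary maps, construction happens outside the Tree type, which is one way to read the "separate builder or factory interface" mentioned in the quoted message.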

