incubator-sling-commits mailing list archives

From conflue...@apache.org
Subject [CONF] Apache Sling > Higher level observation services
Date Thu, 02 May 2013 14:41:01 GMT
<html>
<head>
    <base href="https://cwiki.apache.org/confluence">
            <link rel="stylesheet" href="/confluence/s/2042/9/1/_/styles/combined.css?spaceKey=SLING&amp;forWysiwyg=true"
type="text/css">
    </head>
<body style="background: white;" bgcolor="white" class="email-body">
<div id="pageContent">
<div id="notificationFormat">
<div class="wiki-content">
<div class="email">
    <h2><a href="https://cwiki.apache.org/confluence/display/SLING/Higher+level+observation+services">Higher
level observation services</a></h2>
    <h4>Page <b>edited</b> by             <a href="https://cwiki.apache.org/confluence/display/~bdelacretaz">Bertrand
Delacretaz</a>
    </h4>
        <br/>
                         <h4>Changes (1)</h4>
                                 
    
<div id="page-diffs">
                    <table class="diff" cellpadding="0" cellspacing="0">
    
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >Being able to express these common
observation patterns as higher-level services, if we can do that, would allow for switching
the underlying implementation seamlessly, and would also help promote best practices in how
we use events, by minimizing the amount of code to write at the application level. <br>
<br></td></tr>
            <tr><td class="diff-added-lines" style="background-color: #dfd;">h1.
Tentative API <br>Here&#39;s a first suggestion for an OSGi-friendly API that allows
for taking advantage of the underlying (Oak/JCR/Sling) observation features while staying
independent of the implementation details. <br> <br>This is just a first draft,
needs review. <br> <br>{code:language=Java} <br>/** Event sent when a resource
changes */ <br>interface ResourceEvent { <br>  /** The path of the changed resource
*/ <br>  String getPath(); <br> <br>  /** What happened under that path
(added, modified, <br>   *  removed, moved, multiple changes, etc.) <br>   */
<br>  int getOperation(); <br> <br>  /** In case of a move, supplies the
new path */ <br>  String getNewPath(); <br>} <br> <br>/** Just register
a ResourceObserver OSGi <br> *  service to start observing content <br> */ <br>interface
ResourceObserver { <br>  /** Which absolute paths to listen to. <br>   *  Prepend
each path with R: to listen <br>   *  recursively */ <br>  String [] getPaths();
<br> <br>  /** If regexps are specified, the observer listens <br>   * 
to all paths that can contain them, and considers <br>   *  only items that match. Can
be combined with <br>   *  getPaths() <br>   */ <br>  String [] getPathRegexps();
<br> <br>  /** Options like node/resource types, used <br>   *  by the underlying
implementation if possible. <br>   */ <br>  Map&lt;String, Object&gt;
getSelectionOptions(); <br> <br>  /** Low priority observers might be called after
and less <br>   *  often than high priority ones */ <br>  int getPriority(); <br>
<br>  /** If &gt; 0, onContentChange is not called more often than that */ <br>
 int getAggregationTimeMsec(); <br> <br>  /** Called when the observer is registered,
with info about <br>   *  how the implementation is handling it (couldn&#39;t take
nodetypes <br>   *  into account, etc) */ <br>  void onRegistration(String info);
<br> <br>  /** Called when changes are detected */ <br>  void onContentChange(List&lt;ResourceEvent&gt;
events); <br>} <br>{code} <br> <br></td></tr>
            <tr><td class="diff-unchanged" >h1. Observation usage patterns <br>Let&#39;s
list the frequent patterns that we see w.r.t observing changes in content. <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
    
            </table>
    </div>                            <h4>Full Content</h4>
                    <div class="notificationGreySide">
        <style type="text/css">

   .deflist h4 { margin-top: 0; font-size:100%; font-weight:normal; font-style:italic; color:#888888;
}
  .deflist h4 + p { margin-top: 0;  }
  .deflist p { margin-left: 2em; }
  .deflist ul, .deflist ol { margin-left: 2em }

</style>


<h1><a name="Higherlevelobservationservices-Overview"></a>Overview</h1>
<p>Analyzing how we use observation in our Sling-based apps reveals a number of recurring
patterns, which are described on this page.</p>

<p>Whether we use JCR observation directly or Sling OSGi events makes little difference
to the final results, but the implementations are very different. The commit hooks offered
by <a href="http://jackrabbit.apache.org/oak/" class="external-link" rel="nofollow">http://jackrabbit.apache.org/oak/</a>
are yet another way of observing content changes, which might be more efficient or scalable
in some cases.</p>

<p>Being able to express these common observation patterns as higher-level services,
if we can do that, would allow for switching the underlying implementation seamlessly, and
would also help promote best practices in how we use events, by minimizing the amount of code
to write at the application level.</p>

<h1><a name="Higherlevelobservationservices-TentativeAPI"></a>Tentative
API</h1>
<p>Here's a first suggestion for an OSGi-friendly API that allows for taking advantage
of the underlying (Oak/JCR/Sling) observation features while staying independent of the implementation
details.</p>

<p>This is just a first draft and needs review.</p>

<div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
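<pre class="code-java">
<span class="code-keyword">import</span> java.util.List;
<span class="code-keyword">import</span> java.util.Map;

/** Event sent when a resource changes */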
<span class="code-keyword">interface</span> ResourceEvent {
  /** The path of the changed resource */
  <span class="code-object">String</span> getPath();

  /** What happened under that path (added, modified,
   *  removed, moved, multiple changes, etc.)
   */
  <span class="code-object">int</span> getOperation();

  /** In <span class="code-keyword">case</span> of a move, supplies the <span class="code-keyword">new</span> path */
  <span class="code-object">String</span> getNewPath();
}

/** Just register a ResourceObserver OSGi
 *  service to start observing content
 */
<span class="code-keyword">interface</span> ResourceObserver {
  /** Which absolute paths to listen to.
   *  Prepend each path with R: to listen
   *  recursively */
  <span class="code-object">String</span> [] getPaths();

  /** If regexps are specified, the observer listens
   *  to all paths that can contain them, and considers
   *  only items that match. Can be combined with
   *  getPaths()
   */
  <span class="code-object">String</span> [] getPathRegexps();

  /** Options like node/resource types, used
   *  by the underlying implementation <span class="code-keyword">if</span> possible.
   */
  Map&lt;<span class="code-object">String</span>, <span class="code-object">Object</span>&gt; getSelectionOptions();

  /** Low priority observers might be called after and less
   *  often than high priority ones */
  <span class="code-object">int</span> getPriority();

  /** If &gt; 0, onContentChange is called at most once per this many milliseconds */
  <span class="code-object">int</span> getAggregationTimeMsec();

  /** Called when the observer is registered, with info about
   *  how the implementation is handling it (couldn't take nodetypes
   *  into account, etc) */
  void onRegistration(<span class="code-object">String</span> info);

  /** Called when changes are detected */
  void onContentChange(List&lt;ResourceEvent&gt; events);
}
</pre>
</div></div>
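
<p>To make the <tt>R:</tt> convention of <tt>getPaths()</tt> more concrete, here is a minimal sketch of how an implementation might decide whether a changed path is covered by an observer. The class <tt>ObservedPaths</tt> and its <tt>matches</tt> method are hypothetical and not part of the draft API. The sketch also assumes that an entry without the <tt>R:</tt> prefix covers only the exact path, which the draft does not state.</p>

```java
/** Hypothetical helper, not part of the draft API above. */
final class ObservedPaths {
    /** Prefix that marks a recursive entry, as described for getPaths(). */
    private static final String RECURSIVE_PREFIX = "R:";

    private ObservedPaths() {
    }

    /** Returns true if changedPath is covered by at least one observed entry. */
    static boolean matches(String[] observedPaths, String changedPath) {
        for (String entry : observedPaths) {
            if (entry.startsWith(RECURSIVE_PREFIX)) {
                String root = entry.substring(RECURSIVE_PREFIX.length());
                // Assumption: a recursive entry covers the root and all its descendants.
                if (changedPath.equals(root) || changedPath.startsWith(withTrailingSlash(root))) {
                    return true;
                }
            } else if (changedPath.equals(entry)) {
                // Assumption: a plain entry covers only the exact path.
                return true;
            }
        }
        return false;
    }

    /** Avoids treating /content/site2 as a descendant of /content/site. */
    private static String withTrailingSlash(String path) {
        return path.endsWith("/") ? path : path + "/";
    }
}
```

<p>With these assumptions, <tt>R:/content/site</tt> covers <tt>/content/site/page</tt> but not <tt>/content/site2</tt>.</p>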

<h1><a name="Higherlevelobservationservices-Observationusagepatterns"></a>Observation
usage patterns</h1>
<p>Let's list the frequent patterns that we see when observing changes in content.</p>

<h2><a name="Higherlevelobservationservices-CachedContent"></a>Cached Content</h2>

<div class="deflist"><h4><a name="Higherlevelobservationservices-Scenario"></a>Scenario</h4>
<p>In-memory data structures, or "compiled" versions of some content, are created when
content changes.</p>
<h4><a name="Higherlevelobservationservices-Typicaluses"></a>Typical uses</h4>
<p>Configurations, CSS/JavaScript processing, Sling installer, etc.</p>
<h4><a name="Higherlevelobservationservices-Trigger"></a>Trigger</h4>
<p>Fine-grained detection of changes in a tree of content, based on paths, path regexps,
node types or any other meaningful property.</p>
<h4><a name="Higherlevelobservationservices-Action"></a>Action</h4>
<p>Clear an internal cache that is rebuilt the next time someone needs it.</p>
<h4><a name="Higherlevelobservationservices-Frequency"></a>Frequency</h4>
<p>For the typical uses listed above, content changes are usually infrequent.</p>
<h4><a name="Higherlevelobservationservices-Performancerequirements"></a>Performance
requirements</h4>
<p>Some latency between content changes and processing is usually not a problem.</p>
<h4><a name="Higherlevelobservationservices-Potentialissues"></a>Potential
issues</h4>
<p>N requests coming just after clearing the cache should cause just one cache rebuild,
not N.</p>

<p>Code using this pattern for configurations might be better served by OSGi configurations
managed by the Sling installer.</p>
<h4><a name="Higherlevelobservationservices-Securityconsiderations"></a>Security
considerations</h4>
<p>Loading content in memory with an admin session will make it available to all users.
Loading just the paths of the corresponding content items, and letting users retrieve the
content themselves, avoids this problem.</p></div>
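
<p>The "one rebuild, not N" requirement above can be sketched as a small cache that is cleared by the observer and rebuilt lazily under a lock. This is only an illustration: the class <tt>RebuildOnDemandCache</tt> is hypothetical, and the sketch assumes that the builder never returns <tt>null</tt>, since <tt>null</tt> is used to mean "needs rebuild".</p>

```java
import java.util.function.Supplier;

/** Hypothetical cache: cleared on content change, rebuilt once on the next read. */
final class RebuildOnDemandCache<T> {
    private final Supplier<T> builder;
    private final Object lock = new Object();
    /** null means the cache is empty and must be rebuilt. */
    private volatile T cached;

    RebuildOnDemandCache(Supplier<T> builder) {
        this.builder = builder;
    }

    /** Meant to be called from the observer's change callback. */
    void invalidate() {
        synchronized (lock) {
            cached = null;
        }
    }

    /** Returns the cached value, rebuilding it at most once per invalidation. */
    T get() {
        T value = cached;
        if (value == null) {
            synchronized (lock) {
                // Re-check under the lock: another thread may have rebuilt already.
                value = cached;
                if (value == null) {
                    value = builder.get();
                    cached = value;
                }
            }
        }
        return value;
    }
}
```

<p>Requests that arrive while a rebuild is running wait on the lock and then reuse the freshly built value instead of starting their own rebuild.</p>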

<h2><a name="Higherlevelobservationservices-ContentExport%2CReplicationtoRemoteSystems"></a>Content
Export, Replication to Remote Systems</h2>
<div class="deflist"><h4><a name="Higherlevelobservationservices-Scenario"></a>Scenario</h4>
<p>Content is exported as a file or pushed to a remote system when it changes.</p></div>

<p>This is similar to the Cached Content pattern.</p>

<h2><a name="Higherlevelobservationservices-ContentIngestion"></a>Content
Ingestion</h2>
<div class="deflist"><h4><a name="Higherlevelobservationservices-Scenario"></a>Scenario</h4>
<p>Files that are dropped into the repository are parsed or processed, resulting in
content changes and/or workflow events.</p>
<h4><a name="Higherlevelobservationservices-Typicaluses"></a>Typical uses</h4>
<p>Ingesting digital assets, parsing incoming email or other structured files.</p>
<h4><a name="Higherlevelobservationservices-Trigger"></a>Trigger</h4>
<p>A file appears in a watched folder.</p>
<h4><a name="Higherlevelobservationservices-Action"></a>Action</h4>
<p>Process the file, create the corresponding content, move the file to a "processed"
or "rejected" folder.</p>
<h4><a name="Higherlevelobservationservices-Frequency"></a>Frequency</h4>
<p>Depends on the application.</p>
<h4><a name="Higherlevelobservationservices-Performancerequirements"></a>Performance
requirements</h4>
<p>Some latency is usually not a problem, but some applications need to process a large
number of files quickly.</p>
<h4><a name="Higherlevelobservationservices-Potentialissues"></a>Potential
issues</h4>
<p>Processing partially saved files too early can be a problem, depending on how files
are added to the repository.</p>
<h4><a name="Higherlevelobservationservices-Securityconsiderations"></a>Security
considerations</h4>
<p>Ingestion folders must be properly secured, and incoming content quarantined unless
it can be proven safe.</p></div>

<h2><a name="Higherlevelobservationservices-ContentTreeReplication"></a>Content
Tree Replication</h2>
<div class="deflist"><h4><a name="Higherlevelobservationservices-Scenario"></a>Scenario</h4>
<p>Two or more content trees are kept in sync, usually with customizable mappings and
transformations.</p>
<h4><a name="Higherlevelobservationservices-Typicaluses"></a>Typical uses</h4>
<p>Management of federations of websites, which have some common parts and some specific
parts.</p>
<h4><a name="Higherlevelobservationservices-Trigger"></a>Trigger</h4>
<p>Fine-grained detection of changes in a tree of content, based on paths, path regexps,
node types or any other meaningful property.</p>
<h4><a name="Higherlevelobservationservices-Action"></a>Action</h4>
<p>Replicate the source tree to the target tree(s), optionally applying customizable
content transformations.</p>
<h4><a name="Higherlevelobservationservices-Frequency"></a>Frequency</h4>
<p>Source events might be quite frequent depending on authoring activity.</p>
<h4><a name="Higherlevelobservationservices-Performancerequirements"></a>Performance
requirements</h4>
<p>Tree transformations might be costly and usually need to run as background jobs.</p>
<h4><a name="Higherlevelobservationservices-Potentialissues"></a>Potential
issues</h4>
<p>An explosion in the number of application-level and repository-level operations is
possible depending on the shape of the content tree federation and on the frequency of source
content changes.</p></div>

<h2><a name="Higherlevelobservationservices-Aggregationofchanges"></a>Aggregation
of changes</h2>
<div class="deflist"><h4><a name="Higherlevelobservationservices-Scenario"></a>Scenario</h4>
<p>Collect a number of change events over time and/or for a content subtree, and provide
an aggregated view.</p>
<h4><a name="Higherlevelobservationservices-Typicaluses"></a>Typical uses</h4>
<p>Detect and act on changes to digital assets, without reacting to each and every small
change.</p>
<h4><a name="Higherlevelobservationservices-Trigger"></a>Trigger</h4>
<p>Fine-grained detection of changes in a tree of content, based on paths, path regexps,
node types or any other meaningful property.</p>
<h4><a name="Higherlevelobservationservices-Action"></a>Action</h4>
<p>Store and aggregate events and deliver the results on demand.</p>
<h4><a name="Higherlevelobservationservices-Frequency"></a>Frequency</h4>
<p>Might be quite frequent on a busy content tree.</p>
<h4><a name="Higherlevelobservationservices-Performancerequirements"></a>Performance
requirements</h4>
<p>Aggregating events efficiently, as well as storing them until they're not needed
anymore, can impact performance.</p>
<h4><a name="Higherlevelobservationservices-Potentialissues"></a>Potential
issues</h4>
<p>Might create a noticeable load on the eventing/observation system, if listening to
many detailed events.</p>
<h4><a name="Higherlevelobservationservices-Securityconsiderations"></a>Security
considerations</h4>
<p>Careless aggregation might expose privileged data.</p></div>
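
<p>As a rough illustration of this pattern, the following sketch collapses the events recorded for each path into a single entry and hands out the accumulated batch on demand. The class <tt>ChangeAggregator</tt> and the operation codes are hypothetical. The draft API only says that <tt>getOperation()</tt> returns an <tt>int</tt> and does not define its values.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical aggregator: keeps one entry per changed path until drained. */
final class ChangeAggregator {
    // Hypothetical operation codes, the draft API does not define any values.
    static final int ADDED = 1;
    static final int MODIFIED = 2;
    static final int REMOVED = 3;
    static final int MULTIPLE = 99;

    /** Pending operation per path, in order of first occurrence. */
    private final Map<String, Integer> pending = new LinkedHashMap<>();

    /** Records one change; differing operations on a path collapse to MULTIPLE. */
    synchronized void record(String path, int operation) {
        pending.merge(path, operation,
                (previous, latest) -> previous.equals(latest) ? previous : Integer.valueOf(MULTIPLE));
    }

    /** Returns the aggregated view and resets the aggregator. */
    synchronized Map<String, Integer> drain() {
        Map<String, Integer> batch = new LinkedHashMap<>(pending);
        pending.clear();
        return batch;
    }
}
```

<p>A consumer could call <tt>drain()</tt> on a timer, which bounds how long events are stored.</p>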

<h2><a name="Higherlevelobservationservices-ConsistencyChecksandFixes"></a>Consistency
Checks and Fixes</h2>
<div class="deflist"><h4><a name="Higherlevelobservationservices-Scenario"></a>Scenario</h4>
<p>Watch specific content subtrees for specific changes (nodes moved etc.) and react
to them to avoid inconsistencies in the content.</p>
<h4><a name="Higherlevelobservationservices-Typicaluses"></a>Typical uses</h4>
<p>Adapt paths that point to other pieces of content when content moves around.</p>
<h4><a name="Higherlevelobservationservices-Trigger"></a>Trigger</h4>
<p>Fine-grained detection of changes in a tree of content, based on paths, path regexps,
node types or any other meaningful property.</p>
<h4><a name="Higherlevelobservationservices-Action"></a>Action</h4>
<p>Modify content to keep it consistent.</p>
<h4><a name="Higherlevelobservationservices-Frequency"></a>Frequency</h4>
<p>Usually not very frequent as that's mostly meant to handle edge cases.</p>
<h4><a name="Higherlevelobservationservices-Performancerequirements"></a>Performance
requirements</h4>
<p>A time window during which content can be seen as inconsistent is often unavoidable,
but keeping that window small is useful.</p>
<h4><a name="Higherlevelobservationservices-Potentialissues"></a>Potential
issues</h4>
<p>Content must not be modified directly by JCR listeners. When using JCR observation,
such modifications should happen asynchronously.</p></div>

<h2><a name="Higherlevelobservationservices-Workflow%2FJobTrigger"></a>Workflow/Job
Trigger</h2>
<div class="deflist"><h4><a name="Higherlevelobservationservices-Scenario"></a>Scenario</h4>
<p>Trigger workflows and jobs when content changes.</p>
<h4><a name="Higherlevelobservationservices-Typicaluses"></a>Typical uses</h4>
<p>The Unix print queue system is a good example, with folders named <em>incoming</em>,
<em>printing</em>, <em>done</em>, <em>rejected</em> that
store print job definitions.</p>
<h4><a name="Higherlevelobservationservices-Trigger"></a>Trigger</h4>
<p>Detection of new content items in the <em>incoming</em> folder(s).</p>
<h4><a name="Higherlevelobservationservices-Action"></a>Action</h4>
<p>Execute the corresponding tasks and move the job definition nodes according to the
results.</p>
<h4><a name="Higherlevelobservationservices-Frequency"></a>Frequency</h4>
<p>Depends on the application.</p>
<h4><a name="Higherlevelobservationservices-Performancerequirements"></a>Performance
requirements</h4>
<p>Depends on the application.</p>
<h4><a name="Higherlevelobservationservices-Potentialissues"></a>Potential
issues</h4>
<p>Locking must be used if several job processors are competing for <em>incoming</em>
jobs.</p>

<p>Distributed processing of those jobs introduces cluster management requirements.</p></div>
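
<p>The locking requirement can be illustrated with an atomic "claim" step that only one processor can win. The sketch below is an in-memory stand-in that works within a single JVM only. It does not use repository locks, and the class <tt>JobClaims</tt> and the example paths are hypothetical. A clustered setup would need an equivalent mechanism at the repository or cluster level.</p>

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory registry of which processor owns which job. */
final class JobClaims {
    private final ConcurrentMap<String, String> ownerByJobPath = new ConcurrentHashMap<>();

    /** Returns true if processorId now owns the job; only one caller can win. */
    boolean tryClaim(String jobPath, String processorId) {
        // putIfAbsent is atomic, so two competing processors cannot both get null.
        return ownerByJobPath.putIfAbsent(jobPath, processorId) == null;
    }

    /** Releases the job, but only if processorId is the current owner. */
    void release(String jobPath, String processorId) {
        ownerByJobPath.remove(jobPath, processorId);
    }
}
```

<p>A processor would call <tt>tryClaim</tt> when it sees a new node in <em>incoming</em> and skip the job if the call returns <tt>false</tt>.</p>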

<h2><a name="Higherlevelobservationservices-MessageQueue"></a>Message Queue</h2>
<div class="deflist"><h4><a name="Higherlevelobservationservices-Scenario"></a>Scenario</h4>
<p>Implement a simple message queue backed by a content repository subtree.</p></div>

<p>This is very similar to the Workflow/Job Trigger use case: messages are exchanged
between producers and consumers, and in the Workflow/Job Trigger case the consumers are job processors.</p>
    </div>
</div>
</div>
</div>
</div>
</body>
</html>
