jackrabbit-oak-dev mailing list archives

From Alexander Klimetschek <aklim...@adobe.com>
Subject Re: Observation
Date Fri, 08 Nov 2013 21:08:13 GMT
On 08.11.2013, at 02:28, Michael Dürig <mduerig@apache.org> wrote:

> This wouldn't work for journalled observation (OAK-1145) where we'd also 
> need this filtering capabilities. There isn't an event listener involved 
> there where to attach such a filter. So we probably need that new API 
> where some kind of a filter object is passed directly.

Oh, yes, I did not look at journalled observation, probably because I haven't used it on the
app layer yet ;-) It is rarely used, but I expect a few use cases, so we should think about
a similar API here as well (using the same filter definitions to make it seamless).

> On the implementation side we will need a more general, flexible and 
> composable filter mechanism than we currently have. Due to the low 
> level nature of these filters (working on the node state diff), these 
> filters will most likely be somewhat awkward to use directly. On the API 
> side we should therefore provide a convenient way to compose and inject 
> such filters. Let's use OAK-1133 to follow up on this.

Yes, that's why I think it should be a filter definition that the implementation can then adapt
to. While it would be nice if you could have the listener pass a generic "boolean filter(Node,
...)" method, this would have to be tied to the JCR API or some oak internals like NodeState,
which would be less optimal. I see the new API more as an extension to JCR rather than something
specific to the current Oak.
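
To make the distinction concrete, here is a minimal sketch of what a purely declarative filter definition might look like. All names in it (FilterDefinition, EventKind, matches) are hypothetical and are not part of JCR or Oak; the only point is that the definition is plain data, so the same object could in principle be handed to a listener registration or to a journal reader, and the implementation is free to translate it into whatever low-level filter it uses internally.

```java
import java.util.EnumSet;

// Hypothetical names throughout; none of this is existing JCR or Oak API.
enum EventKind { NODE_ADDED, NODE_REMOVED, PROPERTY_CHANGED }

// A declarative filter: plain data, no callback into JCR Node or Oak NodeState.
final class FilterDefinition {
    private final String pathPrefix;
    private final EnumSet<EventKind> kinds;

    FilterDefinition(String pathPrefix, EnumSet<EventKind> kinds) {
        this.pathPrefix = pathPrefix;
        this.kinds = EnumSet.copyOf(kinds); // defensive copy
    }

    String pathPrefix() { return pathPrefix; }

    EnumSet<EventKind> kinds() { return EnumSet.copyOf(kinds); }

    // Reference evaluation only. A real implementation would more likely
    // translate the definition into a filter on the node state diff.
    boolean matches(String path, EventKind kind) {
        if (!kinds.contains(kind)) {
            return false;
        }
        return pathPrefix.equals("/")
                || path.equals(pathPrefix)
                || path.startsWith(pathPrefix + "/");
    }
}
```

With this sketch, a definition for "/content/site" and NODE_REMOVED would match a removal at "/content/site/page", but not one at "/content/sitemap", and not an addition below "/content/site".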

> This somewhat interacts with what we've done on OAK-803: the listening 
> session gets refreshed before observation events are delivered. This was 
> added to make observation more backward compatible. If we add a way to 
> explicitly disable this refreshing, the listening session will have 
> access to deleted nodes. Added nodes will then not be visible until an 
> explicit refresh is done.

Great! Then we can actually do both independently: a) support filtering on deleted nodes (which
is what most of the access to deleted nodes in listeners is used for, afaics; for the actual
work they might just want to pick up the path of the deleted node and update some other place/cache),
and b) read old/new data in the listener, depending on whether refresh() is called or not.

> I think there are three scenarios for applications:
> 1. Only rely on cluster local events if possible,
> 2. otherwise apply a sufficiently specific filter so as not to get swamped,
> 3. come up with a custom solution (i.e. MQ based on an Observer).


> Separate threads were introduced with OAK-1113 to expedite event 
> delivery, which also increased compatibility with Sling and solved 
> OAK-1084.
> I think we can further improve this by introducing a thread pool. When 
> integrating this with the Whiteboard we would put the deployment into 
> control on weighting observation throughput against resource usage. I 
> wouldn't go so far as to kill blocked threads, for the same reason Jukka 
> mentioned in his reply. However giving the deployment control over the 
> thread pool makes it possible to use an unbounded pool that starts out 
> with a few threads. Blocking handlers would then lead to higher resource 
> usage while events are still being delivered to non-blocked listeners.

Sounds good.
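
The effect of an unbounded pool can be illustrated with plain java.util.concurrent, independent of Oak. The sketch below is only an assumption about how such a pool might behave, not the OAK-1113 code: PooledDispatchSketch and the listener names are made up, and Executors.newCachedThreadPool() stands in for whatever pool a deployment would supply via the Whiteboard. One listener blocks, and the other two still receive their event because the pool simply adds threads.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not Oak's actual observation dispatcher.
final class PooledDispatchSketch {

    // Returns the names of the listeners that handled the event
    // before the timeout expired.
    static List<String> dispatch(long timeoutMillis) {
        // Unbounded pool: a new thread is created whenever all others are busy.
        ExecutorService pool = Executors.newCachedThreadPool();
        CountDownLatch release = new CountDownLatch(1);  // keeps one listener stuck
        CountDownLatch fastDone = new CountDownLatch(2); // counts the healthy listeners
        List<String> delivered = new CopyOnWriteArrayList<>();
        try {
            // A listener that blocks inside its event handler.
            pool.execute(() -> {
                try {
                    release.await();
                    delivered.add("blocked");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Two well-behaved listeners, submitted after the blocking one.
            for (String name : List.of("fast-1", "fast-2")) {
                pool.execute(() -> {
                    delivered.add(name);
                    fastDone.countDown();
                });
            }
            fastDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
            return List.copyOf(delivered); // snapshot taken while "blocked" is still stuck
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return List.copyOf(delivered);
        } finally {
            release.countDown(); // let the stuck listener finish
            pool.shutdown();
        }
    }
}
```

The cost is the one described above: every blocked handler pins a thread, so resource usage grows with the number of misbehaving listeners, which is why the choice of pool is better left to the deployment.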

>> I guess it is that reading asynchronously from the immutable
>> NodeStates is more efficient than multiple blocking queues. Which
>> speaks well for the underlying oak implementation :)
> Good to have someone comment who actually got his hands dirty before 
> being smart ;-)

Yes, I totally forgot about the positive effect of threads when you have 8 cpu cores :)

> I wouldn't care about the extra sessions. These should be cheap. And if 
> they aren't we should fix that.

Good. Having something other than the session as the basis would actually lead to unnecessary API
and lifecycle management complexity.

> Re. the warning: this should only be 
> logged once per session and is meant as help for migrating to Oak. We 
> can try to further lower the noise if required or remove it entirely.

Maybe those messages can be put into a separate log (in default deployments). Because they
include the stack trace (for a good reason) and occur especially when something is happening,
they make looking at the error.log while debugging an application a bit harder.

> Many parts just recently moved to oak-core (OAK-950). After untangling 
> the dependencies (OAK-1143, ongoing) it might be possible to move the 
> respective parts back to oak-jcr.

I see.
