jackrabbit-dev mailing list archives

From Mete Atamel <mata...@adobe.com>
Subject Re: RepositoryStatistics question
Date Wed, 22 Feb 2012 16:17:21 GMT
To follow up on this topic, I have another question on extensibility. Right now, Jackrabbit
reports TimeSeries for things like BUNDLE_READ_COUNTER, BUNDLE_WRITE_COUNTER, etc., but what
if I want to extend Jackrabbit and report a TimeSeries for something new? As far as I can tell,
there's no way to extend Jackrabbit to include TimeSeries for additional properties. That's
because the types of TimeSeries are defined in the RepositoryStatistics class as the Type enum,
and enums in Java cannot be extended. So, any suggestions on how additional TimeSeries could be
defined? If this is not possible, I'm wondering if there's any interest in modifying Jackrabbit
to support TimeSeries extensibility?
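
One common way around the closed-enum limitation is to key the statistics on an interface that enums can implement, so that new types can be declared outside the core. The following is only a sketch of that idea and not Jackrabbit code: StatType, BuiltInType, CustomType, MY_CACHE_EVICTION_COUNTER and StatRegistry are all hypothetical names, and the reset flags simply mirror what is described elsewhere in this thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical key interface; Enum.name() already satisfies name().
interface StatType {
    String name();
    boolean isResetValueEachSecond();
}

// Stand-in for the built-in Type enum (not the real Jackrabbit class).
enum BuiltInType implements StatType {
    SESSION_READ_COUNTER(true),
    SESSION_COUNT(false);

    private final boolean resetValueEachSecond;

    BuiltInType(boolean resetValueEachSecond) {
        this.resetValueEachSecond = resetValueEachSecond;
    }

    @Override
    public boolean isResetValueEachSecond() {
        return resetValueEachSecond;
    }
}

// A type declared by an extension, outside of the core enum.
enum CustomType implements StatType {
    MY_CACHE_EVICTION_COUNTER(true);

    private final boolean resetValueEachSecond;

    CustomType(boolean resetValueEachSecond) {
        this.resetValueEachSecond = resetValueEachSecond;
    }

    @Override
    public boolean isResetValueEachSecond() {
        return resetValueEachSecond;
    }
}

// Hypothetical registry that hands out one counter per type, built-in or custom.
final class StatRegistry {
    private final Map<StatType, AtomicLong> counters = new ConcurrentHashMap<>();

    AtomicLong getCounter(StatType type) {
        return counters.computeIfAbsent(type, t -> new AtomicLong());
    }
}
```

With something like this, an extension could call getCounter(CustomType.MY_CACHE_EVICTION_COUNTER) in exactly the same way as for a built-in type, without touching the core enum.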

-Mete

From: Adobe Systems <matamel@adobe.com>
Date: Thu, 16 Feb 2012 06:56:30 -0800
To: "dev@jackrabbit.apache.org" <dev@jackrabbit.apache.org>
Subject: Re: RepositoryStatistics question

Hi Alex, thanks for the feedback.

I think the two options that you outlined both assume that there are updates at least every
second. In that case, as you said, you have the option of pushing info to TimeSeries as actual
updates happen (which could be more frequent than every second) or you could aggregate updates
into a number and push the aggregate number every second.

However, my point was more about the case where updates arrive less often than once per second,
say every 2 seconds. In that case, you'd see a TimeSeries like 100, 0, 40, 0, … and the problem
with that is that it makes you think the actual bundle count is 100 and then goes down to
0, when it actually means that at time 1 the bundle count is 100 and at time 2 the bundle count
didn't change, i.e. it is still 100.

This resetting every second is fine for things like SESSION_READ_COUNTER, where we actually
care about the frequency of session reads, and frequency is by definition per second. However,
for things like session count, bundle count and bundle size, we care about a number that increases
or decreases but is not a frequency (i.e. it shouldn't reset every second).

SESSION_COUNT gets this right (it doesn't reset every second) and I do think that BUNDLE_COUNTER
and BUNDLE_WS_SIZE_COUNTER should do the same because they represent numbers, not frequencies.
Otherwise, the implementations have to update them every second with the same values in order
to avoid getting zeros, and that's unnecessary IMO.

-Mete

From: Alex Parvulescu <alex.parvulescu@gmail.com>
Reply-To: "dev@jackrabbit.apache.org" <dev@jackrabbit.apache.org>
Date: Thu, 16 Feb 2012 06:15:37 -0800
To: "dev@jackrabbit.apache.org" <dev@jackrabbit.apache.org>
Subject: Re: RepositoryStatistics question

Hi Mete,

The answer depends on the implementation, and as you can see on the wiki page [0] these two
counters have not been implemented yet :)

You have 2 options here:

1) is what you listed above: you provide an incremental implementation where each time there
is a change to the number of bundles, you push that info to the counter via the #addAndGet(long
delta) method.
In this case, yes we should change to non-incremental.

or

2) similar to BUNDLE_CACHE_SIZE_COUNTER, you provide the absolute number each time via #set(long
newValue), possibly at a lower frequency than the one at which the actual updates come in.

So, as I've said it depends on the actual implementation and not having one yet means that
if you find it easier the other way around (#1) we can switch without breaking anything.
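
To make the two options concrete, here is a minimal sketch that uses a plain java.util.concurrent.atomic.AtomicLong as a stand-in for the statistics counter (it has the same #addAndGet(long) and #set(long) methods). BundleCountReporting and its method names are hypothetical and only serve as illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

final class BundleCountReporting {

    // Option 1: push every change as a delta at the moment it happens.
    static long reportIncremental(AtomicLong counter, long[] deltas) {
        for (long delta : deltas) {
            counter.addAndGet(delta);
        }
        return counter.get();
    }

    // Option 2: push the absolute number, possibly less often than changes occur.
    static long reportAbsolute(AtomicLong counter, long currentTotal) {
        counter.set(currentTotal);
        return counter.get();
    }
}
```

Both calls end up at the same value as long as the counter is not reset in between. With option 1, a per-second reset discards the running total, so the reported values become per-second deltas rather than a count.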


best,
alex

[0] http://wiki.apache.org/jackrabbit/Statistics


On Thu, Feb 16, 2012 at 2:20 PM, Mete Atamel <matamel@adobe.com> wrote:
Hi,

I was looking at some of the values of Type enum in RepositoryStatistics class and these two
caught my attention:


        BUNDLE_COUNTER(true),

        BUNDLE_WS_SIZE_COUNTER(true),

I'm assuming BUNDLE_COUNTER is the current number of nodes/bundles and BUNDLE_WS_SIZE_COUNTER
is the total size of all nodes/bundles in the workspace (correct me if I'm wrong). The problem
is that these enums are initialized with true, meaning they will reset their counts to zero
every second and this is kind of weird.

For example, if 100 nodes are created at time 1 and another 100 at time 3, you'd see something
like this for BUNDLE_COUNTER: 100, 0, 100, 0, 0, … But I'd expect to see 100, 100, 200, 200, 200,
… because BUNDLE_COUNTER sounds like the total number of nodes at the current time rather
than new bundles created at the current time. If it were named NEW_BUNDLE_COUNT, I'd expect
to see 100, 0, 100, 0, 0, … The same goes for BUNDLE_WS_SIZE_COUNTER.
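
The difference between the two series can be reproduced with a small simulation. This is a simplified model of the behaviour described above, not the actual TimeSeries recorder: PerSecondSeries and record are hypothetical names, and the model assumes the counter value is sampled once at the end of each second.

```java
import java.util.ArrayList;
import java.util.List;

final class PerSecondSeries {

    /**
     * deltas[i] is the number of bundles created during second i + 1.
     * Returns the counter value sampled at the end of each second.
     */
    static List<Long> record(long[] deltas, boolean resetEachSecond) {
        List<Long> series = new ArrayList<>();
        long counter = 0;
        for (long delta : deltas) {
            counter += delta;
            series.add(counter);
            if (resetEachSecond) {
                counter = 0; // what initializing the enum value with true amounts to
            }
        }
        return series;
    }
}
```

For deltas {100, 0, 100, 0, 0}, record(deltas, true) gives 100, 0, 100, 0, 0 and record(deltas, false) gives 100, 100, 200, 200, 200.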

So, I'm suggesting that we change these enum values to initialize with false instead and I'm
curious to know what everyone else thinks.

Thanks,
Mete

