commons-issues mailing list archives

From "Liviu Tudor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SANDBOX-341) [functor] New components: summarize and aggregate
Date Wed, 31 Aug 2011 17:01:10 GMT

    [ https://issues.apache.org/jira/browse/SANDBOX-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094689#comment-13094689
] 

Liviu Tudor commented on SANDBOX-341:
-------------------------------------

Hi Simone,

Thanks for the feedback -- my first commit, so I expected a lot of things to go wrong.
Since my code was initially committed to {{commons.lang}}, I didn't have access to the classes
you mentioned, so I will have a look and use them.
Secondly, regarding the Apache header, am I correct in assuming that simply copying and pasting
it from an existing class in the repository should be enough? Point taken about the {{@author}}
tag, that was generated automatically by my Eclipse (note to self: adjust my _Eclipse_ templates
to remove that and a few other things!).
I didn't spend much time on unit testing purely because I still wasn't sure whether this was
the right place to commit this component. Now that I am on the right track, I will spend some
time implementing those properly.
Last but not least, {{org.apache.commons.functor.summarizer.TimedSummarizer#MAIN_TIMER}} is
static in order to avoid creating a new one for each instance -- but you are right, it can
create a memory leak. I will actually spend some time on it and probably provide a factory
for creating instances of the {{TimedSummarizer}} class which will allow for the instance
to use its own {{Timer}} or a shared one; this is so "power users" can use the shared one
and minimize the memory and threading footprint, while still allowing for Joe Average to avoid
the memory leakage by using a per-instance {{Timer}}. Do you think that would be a good idea?
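To make the factory idea concrete, here is a minimal sketch of what I have in mind. It is not the code attached to this issue: the class name {{TimedSumSketch}} and the methods {{withOwnTimer}}, {{withSharedTimer}} and {{close}} are hypothetical names used only for illustration. It also assumes the leak comes from the static {{Timer}} keeping a reference to every scheduled task (and through it to every summarizer), so the sketch cancels and purges the task when an instance is closed.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch only; not the TimedSummarizer attached to SANDBOX-341. */
final class TimedSumSketch {
    /** Lazily created timer shared by all instances that ask for it. */
    private static Timer sharedTimer;

    private final AtomicLong sum = new AtomicLong();
    private final Timer timer;
    private final boolean ownsTimer;
    private TimerTask flushTask;

    private TimedSumSketch(Timer timer, boolean ownsTimer) {
        this.timer = timer;
        this.ownsTimer = ownsTimer;
    }

    /** Per-instance timer: one extra thread, released entirely by close(). */
    static TimedSumSketch withOwnTimer(long intervalMillis) {
        TimedSumSketch s = new TimedSumSketch(new Timer(true), true);
        s.start(intervalMillis);
        return s;
    }

    /** Shared timer: a single daemon thread serves every instance. */
    static synchronized TimedSumSketch withSharedTimer(long intervalMillis) {
        if (sharedTimer == null) {
            sharedTimer = new Timer("shared-summarizer-timer", true);
        }
        TimedSumSketch s = new TimedSumSketch(sharedTimer, false);
        s.start(intervalMillis);
        return s;
    }

    /** Scheduled outside the constructor so "this" never escapes half-built. */
    private void start(long intervalMillis) {
        flushTask = new TimerTask() {
            @Override
            public void run() {
                flush();
            }
        };
        timer.schedule(flushTask, intervalMillis, intervalMillis);
    }

    void add(int value) {
        sum.addAndGet(value);
    }

    long current() {
        return sum.get();
    }

    /** Resets the running value; the timer calls this on every interval. */
    void flush() {
        sum.set(0L);
    }

    /** Detaches this instance from its timer so it can be garbage collected. */
    void close() {
        flushTask.cancel();
        if (ownsTimer) {
            timer.cancel();   // stops the per-instance thread
        } else {
            timer.purge();    // drops the cancelled task from the shared queue
        }
    }
}
```

With this shape, the caller picks the trade-off at creation time, e.g. {{TimedSumSketch.withSharedTimer(1000L)}} for the smaller threading footprint, and in both cases {{close()}} is what prevents the instance from staying reachable through the timer.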

> [functor] New components: summarize and aggregate
> -------------------------------------------------
>
>                 Key: SANDBOX-341
>                 URL: https://issues.apache.org/jira/browse/SANDBOX-341
>             Project: Commons Sandbox
>          Issue Type: Improvement
>          Components: Functor
>         Environment: JDK 1.6.0_25 but should work with any JDK 5+ (possibly 1.4 though
I haven't tested).
>            Reporter: Liviu Tudor
>            Priority: Minor
>              Labels: features
>         Attachments: commons-functor-aggregate+summarizer.zip
>
>
> This is the next step from https://issues.apache.org/jira/browse/SANDBOX-340 -- as instructed
I'm finally hoping to get the code in the right place and hopefully this is something that
the functor component could do with.
> Whereas initially I just started with the summarizer component, I have now added the
second one, the "aggregator", as they are somewhat related. If this code proves to be useful
to functor in any way, it would actually be good to get some feedback on these 2 to see if
the class hierarchy can in fact be changed to share some common functionality, as I feel (probably
due to the similar needs that led to writing/using these components) that somehow they should
share a common base.
> In brief, the 2 components:
> * aggregator: this just allows for data to be aggregated in a user-defined way (e.g.
stored in a list for the purpose of averaging, computing the median etc). The classes
provided actually offer the implementation for storing data in a list and computing the above-mentioned
values or summing up everything.
> * timed summarizer: this is another variation of the aggregator; however, it adds the
idea of regular "flushes", so based on a timer it will reset the value and start summing/aggregating
the data again. Rather than using an aggregator which would store the whole data series (possibly
for applying more complex formulas), this component just computes the formula on the fly on each
request and stores only the result. (Which does mean that things like the arithmetic
mean, median etc would be difficult to compute without knowing upfront how many calls will
be received -- i.e. how many elements we will be required to summarize/aggregate.) So the
memory footprint of running this is much smaller -- even though, as I said, it achieves similar
results. I have only provided a summarizer which operates on integers, but obviously others
for float, double etc can be created if we go ahead with this design.
> Hopefully the above makes sense; this code has resulted from finding myself writing similar
components to these a few times, and because it's always been either one type (e.g. aggregator)
or another (summarizer) I quite possibly haven't given enough thought to the class design
to join these 2. Also, unfortunately, the time I sat down to make these components a bit more
general and submitted issue 340 was nearly 3 months ago, so I'm trying to remember all
the ideas I had at the time; bear with me please if these are still a bit fuzzy :) However,
if you can make use of these I'm quite happy to elaborate on areas that are unclear and obviously
put some effort into getting these components to the standard required to put them into
a release.
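The difference between the two components described in the quoted issue can be sketched roughly as follows. This is an illustration under assumptions, not the code in the attached zip: {{ListAggregatorSketch}} and {{RunningSumSketch}} are hypothetical names, both operate on integers only, and neither is thread-safe. The first keeps the whole series so that values such as the median can be computed later; the second keeps a single running value and can be flushed (in the real component a timer would trigger that flush).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical aggregator: stores every value, computes on demand. */
final class ListAggregatorSketch {
    private final List<Integer> values = new ArrayList<Integer>();

    void add(int value) {
        values.add(value);
    }

    long sum() {
        long total = 0L;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    double mean() {
        requireData();
        return (double) sum() / values.size();
    }

    double median() {
        requireData();
        // Sort a copy so the insertion order of the stored series is preserved.
        List<Integer> sorted = new ArrayList<Integer>(values);
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        if (sorted.size() % 2 == 1) {
            return sorted.get(mid);
        }
        return (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    private void requireData() {
        if (values.isEmpty()) {
            throw new IllegalStateException("no data aggregated yet");
        }
    }
}

/** Hypothetical summarizer: constant memory, only the running result is kept. */
final class RunningSumSketch {
    private long sum;

    void add(int value) {
        sum += value;
    }

    long current() {
        return sum;
    }

    /** Resets the running value, as a periodic flush would. */
    void flush() {
        sum = 0L;
    }
}
```

Feeding 5, 1, 3 and 2 into both gives a sum of 11 in either case, but only the list-based sketch can still answer questions about the series afterwards, at the cost of memory that grows with every call to {{add}}.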

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
