community-dev mailing list archives

From Daniel Gruno <>
Subject Re: Adding some statistics to projects.a.o?
Date Wed, 26 Oct 2016 22:22:07 GMT
On 10/27/2016 12:20 AM, sebb wrote:
> These explanations of what the stats mean need to be provided on
> the page or linked from it.

Right, perhaps below/above each of them would be a good idea. I'll get
working on that tomorrow.

> On 26 October 2016 at 22:12, Daniel Gruno <> wrote:
>> On 10/26/2016 10:56 PM, Phil Steitz wrote:
>>> On 10/26/16 11:07 AM, Daniel Gruno wrote:
>>>> I added an initial stats page at
>>>> - assuming no one objects,
>>>> I'll add it to the top menu of the other pages in a day or so.
>>>> Do peruse - anything we need to add/edit?
>>> Maven is not a programming language.  What exactly is the
>>> denominator on that stat?  Number of files?  Lines of code?
>>> Projects primarily using?
>> I suspect it's counting scripts written specifically for Maven. The
>> denominator is lines of functional code (101 million in total, not
>> counting blanks and comments, which take us to 150M total).
>>> What does lines changed mean?  It looks like lines changed is
>>> somehow supposed to be insertions plus deletions.  Where are the
>>> mods to lines?  Is this just counting -- and ++ out of diffs?  That
>>> is a very bad metric on how much code has actually changed or what a
>>> contribution is.  Formatting nits, creating RCs, etc generate huge
>>> amounts of this stuff without really contributing much.
>> AIUI, the huge ++/-- are weeded out in these charts, otherwise it would
>> be in the millions of lines of code changed some days. We have, on
>> average, 700-800 commits per business day to our repos, and with roughly
>> 100k additions according to the chart, that would indicate an average of
>> ~125 lines changed per commit. It's very possible that this includes
>> some automatic changes, I can't say. As they are somewhat static, I am
>> considering just scrapping that part, it probably doesn't show that much
>> of value to us.
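
As a quick check of the arithmetic above: roughly 100k added lines spread over 700-800 commits per business day works out to about 125 to 143 lines per commit, so the ~125 figure corresponds to the upper end of the commit range. A minimal sketch using only the numbers quoted in the message (the function name is hypothetical):

```python
def lines_per_commit(lines_added, commits):
    """Average number of added lines per commit."""
    return lines_added / commits

# Figures quoted in the message: ~100k additions, 700-800 commits per business day.
daily_additions = 100_000
low = lines_per_commit(daily_additions, 800)   # busiest days -> fewest lines per commit
high = lines_per_commit(daily_additions, 700)  # quietest days -> most lines per commit

print(f"{low:.0f} to {high:.0f} lines per commit")
```

Note that this is a plain average; a handful of large automated commits would dominate it, which is the concern raised about the metric.
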
>>> What in the heck is an "author?"  We eliminated @author tags years
>>> ago because *we don't think like that* - let's not regress.  If it
>>> means someone created a new file, what is different about that than
>>> just committing a patch of some kind?  I would drop that metric or
>>> just merge it into committers.
>> An author in this context is someone who authored a piece of code, a
>> committer is someone who committed the code to a repository. They need
>> not be the same person. In Subversion, they are the same, as svn does
>> not distinguish. In git, they are two different entities. Committers are
>> always ASF committers, authors can be any contributor to a project with
>> or without an apache account.
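
To illustrate the author/committer distinction in git, here is a minimal sketch that splits the two roles from log output. It assumes input in the shape produced by `git log --format='%an%x09%cn'` (author name, a tab, committer name); the sample names and the `split_roles` helper are hypothetical, and this is not necessarily how the stats page computes its numbers.

```python
from collections import Counter

# Hypothetical sample of `git log --format='%an%x09%cn'` output:
# author name, a tab, then committer name, one commit per line.
SAMPLE_LOG = """\
Alice Contributor\tBob Committer
Bob Committer\tBob Committer
Carol Contributor\tDave Committer
"""

def split_roles(log_text):
    """Count commits per author and per committer from tab-separated log lines."""
    authors, committers = Counter(), Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        author, committer = line.split("\t", 1)
        authors[author] += 1
        committers[committer] += 1
    return authors, committers

authors, committers = split_roles(SAMPLE_LOG)

# People who authored code but never committed it themselves,
# i.e. contributors whose patches were applied by someone else.
external = sorted(set(authors) - set(committers))
print(external)
```

For a Subversion-backed repository the two columns would always match, since svn records only one identity per commit.
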
>>> I very much do not like the "leader board" concept, especially with
>>> a bogus metric like number of diff lines generated driving it.  I
>>> would drop that thing.
>> It's number of unique commits driving it, not number of diffs - that's a
>> secondary statistic. While we disagree on liking this, I'll definitely
>> take it under advisement as I work on the page. Note, it's not been made
>> public in the sense that the front page links to it just yet, I'll do
>> that once we are more aligned idea-wise.
>>> I would rather see "busiest" or "most active" projects defined by
>>> something more meaningful like number of issues resolved or number
>>> of releases.   So change at least the first metric on the bottom to
>>> number of issues resolved and maybe make the second one number of
>>> releases.
>> Number of releases would be nigh impossible, as we don't really keep
>> score of that, at all. Issues solved could be done easily, though we
>> don't have any formal mapping from issue tracker names back to our
>> projects, so it would probably show which JIRA/BZ instances are the most
>> active instead.
>> With regards,
>> Daniel.
>>> Phil
>>>> With regards,
>>>> Daniel.
>>>> On 10/26/2016 01:07 PM, Daniel Gruno wrote:
>>>>> Hi folks,
>>>>> I was wondering, since we have full access to Snoot for the ASF, why
>>>>> not take advantage of that and add a statistics page to,
>>>>> showing the various live stats available (no. of commits/committers,
>>>>> largest repos by size/commits, proper language breakdown, relationship
>>>>> mapping, mail stats etc).
>>>>> I was inclined to JFDI, but I'd love to hear what others think about
>>>>> this. If I don't hear any loud objections, I'll add a stats page today,
>>>>> and we can see if it's of any use :)
>>>>> Comments? Suggestions? :)
>>>>> With regards,
>>>>> Daniel.
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail:
>>>>> For additional commands, e-mail:
