community-dev mailing list archives

From sebb <>
Subject Re: Adding some statistics to projects.a.o?
Date Wed, 26 Oct 2016 22:20:40 GMT
These explanations of what the stats mean need to be provided on
the page or linked from it.

On 26 October 2016 at 22:12, Daniel Gruno <> wrote:
> On 10/26/2016 10:56 PM, Phil Steitz wrote:
>> On 10/26/16 11:07 AM, Daniel Gruno wrote:
>>> I added an initial stats page at
>>> - assuming no one objects,
>>> I'll add it to the top menu of the other pages in a day or so.
>>> Do peruse - anything we need to add/edit?
>> Maven is not a programming language.  What exactly is the
>> denominator on that stat?  Number of files?  Lines of code?
>> Projects primarily using?
> I suspect it's counting scripts written specifically for Maven. The
> denominator is lines of functional code (101 million in total, not
> counting blanks and comments which take us to 150M total).
>> What does lines changed mean?  It looks like lines changed is
>> somehow supposed to be insertions plus deletions.  Where are the
>> mods to lines?  Is this just counting -- and ++ out of diffs?  That
>> is a very bad metric on how much code has actually changed or what a
>> contribution is.  Formatting nits, creating RCs, etc. generate huge
>> amounts of this stuff without really contributing much.
> AIUI, the huge ++/-- are weeded out in these charts, otherwise it would
> be in the millions of lines of code changed some days. We have, on
> average, 700-800 commits per business day to our repos, and with roughly
> 100k additions according to the chart, that would indicate an average of
> ~125 lines changed per commit. It's very possible that this includes
> some automatic changes, I can't say. As they are somewhat static, I am
> considering just scrapping that part, it probably doesn't show that much
> of value to us.
>> What in the heck is an "author?"  We eliminated @author tags years
>> ago because *we don't think like that* - let's not regress.  If it
>> means someone created a new file, what is different about that than
>> just committing a patch of some kind?  I would drop that metric or
>> just merge it into committers.
> An author in this context is someone who authored a piece of code, a
> committer is someone who committed the code to a repository. They need
> not be the same person. In Subversion, they are the same, as svn does
> not distinguish. In git, they are two different entities. Committers are
> always ASF committers, authors can be any contributor to a project with
> or without an apache account.
>> I very much do not like the "leader board" concept, especially with
>> a bogus metric like number of diff lines generated driving it.  I
>> would drop that thing.
> It's number of unique commits driving it, not number of diffs - that's a
> secondary statistic. While we disagree on liking this, I'll definitely
> take it under advisement as I work on the page. Note, it's not been made
> public in the sense that the front page links to it just yet, I'll do
> that once we are more aligned idea-wise.
>> I would rather see "busiest" or "most active" projects defined by
>> something more meaningful like number of issues resolved or number
>> of releases.   So change at least the first metric on the bottom to
>> number of issues resolved and maybe make the second one number of
>> releases.
> Number of releases would be nigh impossible, as we don't really keep
> score of that, at all. Issues solved could be done easily, though we
> don't have any formal mapping from issue tracker names back to our
> projects, so it would probably show which JIRA/BZ instances are the most
> active instead.
> With regards,
> Daniel.
>> Phil
>>> With regards,
>>> Daniel.
>>> On 10/26/2016 01:07 PM, Daniel Gruno wrote:
>>>> Hi folks,
>>>> I was wondering, since we have full access to Snoot for the ASF, why not
>>>> take advantage of that and add a statistics page to,
>>>> showing the various live stats available (no. of commits/committers,
>>>> largest repos by size/commits, proper language breakdown, relationship
>>>> mapping, mail stats etc).
>>>> I was inclined to JFDI, but I'd love to hear what others think about
>>>> this. If I don't hear any loud objections, I'll add a stats page today,
>>>> and we can see if it's of any use :)
>>>> Comments? Suggestions? :)
>>>> With regards,
>>>> Daniel.
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail:
>>>> For additional commands, e-mail:

