incubator-general mailing list archives

From Sergio Fernández <wik...@apache.org>
Subject Re: Apache Metrics, Not Apache Humans
Date Mon, 16 Nov 2015 04:58:38 GMT
Marko, the metrics approach has been discussed in the past; see for instance
http://markmail.org/message/ubx3utli3bnltv75
So far my feeling is that the ASF prefers to rely on people to form an
opinion of projects rather than basing it on pure statistical metrics. But
I'd be happy to see something like that.
Hi,

I was talking with Daniel Gruno and wrote the following ideas to him. Note
that these are just ideas and not based on any real momentary issue or
concern -- though a more general concern about how Apache should evolve.

Apache should NOT use a binary "podling" / "top-level" model. All projects
should simply have a "health score" and that health score is derived from
measurables. Because of Apache Infrastructure's centralized server model
(email lists, version control, distributions, homepages, etc.), it has the
ability to gather metrics such as, for example, the distribution of pushes
to the repository, the branch factor of the mailing list, the centrality of
the project in the Central Maven repository dependency graph, the number of
non-sequiturs (dead-end conversations) in the email chain, the length of
discussions in JIRA, etc. etc. Which metrics are important? Who cares --
just make up things to glean from the wealth of information you already
have access to. Watch...
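
To make two of the metrics named above concrete, here is a minimal sketch.
The email does not define how any metric is computed, so everything here is
an assumption: "distribution of pushes" is read as the normalized Shannon
entropy of pushes per author, and a "dead-end conversation" is read as a
thread that received no reply. The function names push_evenness and
dead_end_ratio are hypothetical.

```python
import math
from collections import Counter


def push_evenness(authors):
    """Normalized Shannon entropy of pushes per author, in [0, 1].

    `authors` is a list with one entry per push (assumed input shape).
    0.0 means a single author made every push; 1.0 means pushes are
    spread perfectly evenly across all authors.
    """
    counts = Counter(authors)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def dead_end_ratio(thread_sizes):
    """Fraction of threads that got no reply.

    `thread_sizes` holds the message count of each thread (assumed input
    shape); a thread with a single message counts as a dead end.
    """
    if not thread_sizes:
        return 0.0
    dead = sum(1 for size in thread_sizes if size <= 1)
    return dead / len(thread_sizes)
```

Each function returns one number per project, so the results can be placed
side by side as coordinates of a per-project metric vector.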

Next, the Apache members subjectively say which projects they think are
"good" (healthy). This can even be a global vote including everyone in the
world, and it should be dynamic, updated as projects evolve over time.
Either way, let's say the ranking says Apache Hadoop, Apache Solr, Apache
Commons, etc. are the (collectively, subjectively) "best" Apache projects.
Now, there should exist a multi-dimensional projection of the
aforementioned gleaned statistics that will have Hadoop, Solr, Commons,
etc. close to one another in metric-space (clustered). Likewise, low
ranking projects should be close to one another in this space and far from
Hadoop, Solr, Commons, etc. Find that projection and that is your "healthy
metric space."
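
The projection described above can be sketched in a few lines. The email
does not say how the projection should be found, so this swaps in one very
simple technique as an assumption: take the axis running from the centroid
of the projects voted "unhealthy" to the centroid of those voted "healthy",
and score every project by where it falls along that axis. The names
fit_health_axis and health_score are hypothetical, and metric vectors are
assumed to be pre-scaled to comparable ranges.

```python
def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def fit_health_axis(metrics, votes):
    """Derive a one-dimensional "health" axis from subjective votes.

    metrics: dict mapping project name -> list of metric values
    votes:   dict mapping project name -> True (healthy) / False (unhealthy)
    Returns (axis, midpoint): the direction from the unhealthy centroid to
    the healthy centroid, and the point halfway between the two.
    """
    healthy = [metrics[name] for name, good in votes.items() if good]
    unhealthy = [metrics[name] for name, good in votes.items() if not good]
    if not healthy or not unhealthy:
        raise ValueError("need at least one vote in each group")
    mu_h = centroid(healthy)
    mu_u = centroid(unhealthy)
    axis = [h - u for h, u in zip(mu_h, mu_u)]
    midpoint = [(h + u) / 2 for h, u in zip(mu_h, mu_u)]
    return axis, midpoint


def health_score(vector, axis, midpoint):
    """Signed position along the axis: positive leans healthy."""
    return sum((v - m) * a for v, m, a in zip(vector, midpoint, axis))
```

A project that nobody voted on can then be scored from its metric vector
alone with health_score. Re-running fit_health_axis whenever the votes
change gives the "dynamic over time" behaviour the paragraph asks for. A
real analysis would likely want something less crude, since a difference of
centroids ignores how the metrics correlate with one another.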

From here, all Apache projects have computed "healthy" score(s), and when
users go to download, let's say, Lucene, they go: "Cool. This is a healthy
project." (It has a HEALTH.txt file distributed with it, let's say.) What
that means is that Lucene, at that release, was in the "healthy" cluster of
the metric space. This model has various benefits:

        1. There is no need to have philosophical arguments (not grounded
in measurables) about what rules a project should follow (bounded by law).
                - Perhaps a project that is exclusive, but is X, is still in
the "healthy" subspace.
                - Perhaps having bad documentation is "unhealthy" even
though Apache doesn't care about documentation.
                - Perhaps too much discussion causes a project to become
"unhealthy."
                - Perhaps … who knows? … let the statistics do the talking.
                -  Apache becomes a breeding ground for different models of
open source (bounded by law), not just "The Apache Way."
                        - And these models are measurable! Let us study the
act of open source.
        2. "Top-level" projects can fall from grace.
                - Currently, all "top-level" projects are "equal." This
should be dynamic, as the mighty do fall.
                - It is possible for what are now "podlings" to be
"healthy" as they simply are coming into Apache.
                        - "The student is the master."
                - Hadoop 1.2.1 might be the healthiest version of Hadoop
(as I tend to believe). "Hadoop" is not a thing eternal.
        3. Less work for people.
                - No more VOTEing on graduation.
                - No more amorphous aesthetic arguments about "The Apache
Way."
                - No more long-winded, contradictory documentation about how
things should be done (bounded by law).

The Apache Way should be about metrics, not about philosophy as different
paths lead to the same mountain top <--- See! Is that random Buddhist
saying that everyone just "believes" even true? :) Get the human out of the
loop!

Thanks for reading,
Marko.

http://markorodriguez.com

P.S. The same should hold true for educational degrees. I graduate and now
forever I'm an expert in computers? Medical doctors too! A 90 year old
doctor can do surgery on me?!?!… Binary graduation is not "real." Metrics,
metrics, metrics --- we live in a world where this is possible. For every
"thing" good comes and goes, up and down…
