Mailing-List: contact dev-help@community.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@community.apache.org
MIME-Version: 1.0
In-Reply-To: <1776188.CZJzUyKLHs@herve-desktop>
References: 
 <CAOGo0Vb1if7z1SJ12efzTsL075aEDVdv6t5yUB0aCvgSHZp_-Q@mail.gmail.com>
	<3952417.emMKRS58qp@herve-desktop>
	<CAOGo0VZ2C-hDsTSaVQEwLPwUgE=CDvo2vhTM5bYtgLBB9_hXUg@mail.gmail.com>
	<1776188.CZJzUyKLHs@herve-desktop>
Date: Wed, 22 Jul 2015 00:16:40 +0100
Message-ID: 
 <CAOGo0VZT7mqo1ihyz79rmZuAoVnvtjoHq48TKF=sgTmJAdACjQ@mail.gmail.com>
Subject: Re: Unnecessary SVN commits [was: svn commit: r1691273 - in
 /comdev/projects.apache.org/site: doap/cxf/cxf.rdf doap/httpd/httpd.rdf
 json/foundation/projects.json json/projects/cxf.json
 json/projects/httpd.json]
From: sebb <sebbaz@gmail.com>
To: dev@community.apache.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 21 July 2015 at 06:45, Hervé BOUTEMY <herve.boutemy@free.fr> wrote:
> On Monday 20 July 2015 01:43:11 sebb wrote:
>> On 19 July 2015 at 14:18, Hervé BOUTEMY <herve.boutemy@free.fr> wrote:
>> > time to explain what I have in mind, because I understand the reactions
>> > about these svn content questions: but I need to explain why I think that
>> > it's not a bug, it's a feature :)
>> >
>> >
>> > 1. generated json files in svn
>> >
>> > even if they are generated, these ones are IMHO useful to ease people just
>> > wanting to work on information rendering, ie the site's html+javascript
>>
>> The current files can still be accessed from the web server; they
>> don't have to be in SVN to be useful.
> seems I was not clear: the question is not the web server.
> The question is the average ASF committer who does not have access to the web
> server but would like to contribute to the web part, fix an issue he sees on
> the live site: currently, one svn checkout, read STRUCTURE.txt and start your
> local web server, and you can fix any html+css+javascript issue
>
>>
>> > Experience with releases.json not being in svn in the first place told me
>> > that not having whole json content in svn was just increasing barrier to
>> > commits from whole ASF committers to projects directory visualization
>>
>> Or maybe it was just that the file formats were not clearly documented.
> no, the problem was not the format, it was the data (even if format
> documentation is something we need also).

But how can one provide the data if the format is not clearly documented?

>
>>
>> > 2. doap files in svn (copies of parsed content or generated ones)
>> >
>> > From the beginning of my work on projects-new, I had a question in mind: is
>> > DOAP itself a problem (since not easy, not well understood), or are there
>> > just problems about the way DOAP is used and explained to ASF committers
>> > (= not DOAP experts, if DOAP experts exist)?
>> >
>> > Any discussion on this list about that question led to some people
>> > wanting to simply drop DOAP, because for them, implicitly, the format
>> > itself/only was the problem, without answering previous question (and
>> > without providing a better alternative = the show stopper for me: no,
>> > simply telling "json" is not a sufficient answer, there has at least to
>> > be a schema)
>>
>> Indeed.
>> Abandoning DOAP and using JSON will just lead to exactly the same
>> problem down the line: *unless* the JSON schema is well designed and
>> documented. Likewise for any other replacement.
> +1
>
>>
>> It's usually obvious to the code/data developers who create the
>> initial codebase how everything hangs together, but as the codebase
>> matures the detailed knowledge will be lost unless it is documented.
>> It's usually possible to tweak existing code to make small fixes
>> without fully understanding the whole, but without a clear
>> understanding of the way the parts are designed to work together the
>> code (and data) tends to grow like spaghetti.
>>
>> The way that the ASF used the DOAP files was not properly documented
>> originally (it's a bit better now), but that tends to be the way with
>> developers - documentation is done after the event, if at all. This is
>> true of many of the new JSON files.
> IMHO, here, the requirement on documentation is even higher since a lot of
> people will need to write data, without being involved in the code using the
> data.
>
>>
>> Note that when we refer to DOAP in this context we are referring to
>> the XML representation.
>> There might be a different representation that is easier to use.
> I'm not a semantic web expert: could we try to write down (in the wiki for
> example) one project RDF/XML DOAP and its equivalent in another notation?

Off-hand, I don't know what other representations exist, but I do know
that there are some complaints about the suitability of XML to
represent RDF.

>>
>> After all, we are trying to describe projects, so Description Of A
>> Project should be a good fit, even if using XML to define the DOAP is
>> not so suitable.
> +1 on the general logic
> but I could not find good documentation on DOAP apart from the DOAP schema
> itself: did I miss something?

I don't know, but it seems likely that some exists; otherwise DOAP would
never have gained any followers.

>>
>> Do we really want to design a new DOAP schema using JSON?
> I rephrase the same question: will we do better documentation if we reinvent
> the wheel? (the answer may be "yes", but need real investment)

I'm not sure that is an equivalent question.
The point is that it takes a lot of effort to design and document a
suitable schema.
We need to be very sure that DOAP is unsuitable before replacing it.
Though it might be possible to replace DOAP/XML with DOAP/JSON or
DOAP/xyz with a lot less effort.
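
As a rough illustration of what DOAP/JSON might look like, here is a
minimal sketch (not an existing tool) that reads one RDF/XML DOAP file
with Python's standard library and emits the same fields as JSON. The
sample project, the choice of fields and the flat JSON layout are
assumptions for illustration only; a real replacement would still need an
agreed and documented schema.

```python
import json
import xml.etree.ElementTree as ET

# Namespaces used by ASF DOAP files.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "doap": "http://usefulinc.com/ns/doap#",
    "asfext": "http://projects.apache.org/ns/asfext#",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical, heavily trimmed DOAP file (illustration only).
SAMPLE_DOAP = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:doap="http://usefulinc.com/ns/doap#"
         xmlns:asfext="http://projects.apache.org/ns/asfext#">
  <doap:Project rdf:about="http://example.apache.org/">
    <doap:name>Apache Example</doap:name>
    <doap:homepage rdf:resource="http://example.apache.org/"/>
    <doap:programming-language>Java</doap:programming-language>
    <asfext:pmc rdf:resource="http://example.apache.org/"/>
  </doap:Project>
</rdf:RDF>
"""


def doap_to_dict(xml_text):
    """Flatten the first doap:Project element into a plain dict.

    Repeated properties overwrite each other in this sketch.
    """
    root = ET.fromstring(xml_text)
    project = root.find("doap:Project", NS)
    if project is None:
        raise ValueError("no doap:Project element found")
    result = {}
    for child in project:
        # Tags look like '{namespace}local-name'; keep only the local name.
        key = child.tag.split("}", 1)[1]
        # A property is either a resource reference or literal text.
        value = child.get(RDF_RESOURCE) or (child.text or "").strip()
        result[key] = value
    return result


if __name__ == "__main__":
    print(json.dumps(doap_to_dict(SAMPLE_DOAP), indent=2, sort_keys=True))
```

The structure carries over almost unchanged, which suggests that a change
of representation is cheap compared with designing a new schema.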

>>
>> > Then my first steps were:
>> > - improve projects new site and switch from projects old, as each project
>> > page on projects-new more clearly shows information that comes from the
>> > project's DOAP file (IMHO, projects old was failing at this, no pun
>> > intended): we'll see if ASF committers can improve their DOAP files (as
>> > some already did since the switch)
>>
>> Yes, better presentation of the data should help to persuade PMCs to
>> fix/improve their data.
>>
>> > - the new DOAP listings location, that is like projects old, but simplified
>> > since only focused on DOAP listings and content (no code):
>> > http://svn.apache.org/viewvc/comdev/projects.apache.org/data/
>> >
>> > These are only the first steps IMHO before deciding if we should continue
>> > with DOAP or find a better alternative (yet to be found/proposed).
>> >
>> > I see 2 other steps:
>> > - clarify what committee DOAP files (also called "PMC descriptors") are
>> > supposed to contain, and how projects (maintained by the committees) are
>> > supposed to link to the committee. As discussed previously, current
>> > convention [1] is really strange.
>>
>> I expect there was a good reason for this at the time, but the "magic"
>> behaviour of the URL is a bad idea in retrospect. Less typing, but
>> lots more special-case code.
>>
>> > And PMC member lists are more easily updated automatically
>> > from committee-info.txt than manually.
>>
>> Yes; that is waiting on INFRA-9942 which seems to have been ignored.
>> Perhaps you can prod infra as well.
> to me, this is not priority 1: since json files are in svn ( :) ), I can parse
> content regularly and update json files from my own computer
> Having the parsing fully automated will be useful sometime, but at the moment
> there is no strong pressure

But when you go on holiday, there is no-one to update the files.

I favour a script that runs whenever committee-info.txt is updated
(can use svnpubsub for this).
The script should convert committee-info.txt into one or two JSON
files, but not do anything else.
The current parsing script is really complicated and generates
additional output.
The generated files would then be available for other scripts to use.
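
Something along these lines is what I have in mind; a minimal sketch only.
The input layout below is an invented stand-in (the real
committee-info.txt layout is not described in this thread), and the
function names and JSON keys are hypothetical. The svnpubsub trigger is
left out: the point is that the script does one conversion and nothing
else.

```python
import json

# Invented stand-in input: a '* Committee' heading followed by indented
# 'Name <id@apache.org>' lines. NOT the real committee-info.txt layout.
SAMPLE = """\
* Example One
    Alice Example <alice@apache.org>
    Bob Example <bob@apache.org>

* Example Two
    Carol Example <carol@apache.org>
"""


def parse_committees(text):
    """Return {committee name: [member ids]} from the stand-in format."""
    committees = {}
    current = None
    for line in text.splitlines():
        if line.startswith("* "):
            current = line[2:].strip()
            committees[current] = []
        elif current is not None and "<" in line and ">" in line:
            # Take the id from the address between '<' and '@'.
            address = line[line.index("<") + 1:line.index(">")]
            committees[current].append(address.split("@", 1)[0])
    return committees


def to_json(text):
    """Convert the input to a stable JSON string and do nothing else."""
    return json.dumps(parse_committees(text), indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_json(SAMPLE))
```

Sorting the keys keeps the output stable, so an unchanged input yields a
byte-identical file and no spurious updates.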

>>
>> > - prefer https://projects.apache.org/doap/ to
>> > https://svn.apache.org/repos/asf/comdev/projects.apache.org/data/
>> > IMHO, /doap/ in projects site, with every ASF committer commit access, and
>> > its per-committee directory containing both PMC descriptor and projects
>> > DOAP descriptors would be easier to understand and maintain than an XML
>> > listing in svn then descriptors in a lot of different places
>> > And this would give a good canonical url for each DOAP file (easing work on
>> > previous item)
>>
>> Agreed it would be easier to have a central location to maintain the DOAPs.
>> I tried something similar with Commons, however several people wanted to keep
>> the DOAPs with the project code.
>>
>> But perhaps if all the DOAPs are together then the objection will be
>> overcome - at least there is a canonical location for them.
>>
>> And it would be a lot easier to fix the typos and syntax errors if all
>> the files were co-located.
>>
>> Note that this will require a good naming convention to avoid clashes
>> and keep track of everything.
> that's the purpose of the https://projects.apache.org/doap/ demo:
> https://projects.apache.org/doap/{committee id}/{project id}.rdf
> ie one directory per Apache committee/TLP/PMC (choose your wording)
>
> and pmc.rdf for the committee PMC data file
>
> really nothing hard
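
For what it's worth, that convention is simple enough to sketch in a few
lines. The function names are hypothetical, the placement of pmc.rdf
inside the committee directory is my reading of the description above,
and the clash check is an assumption about what a naming rule would have
to cover.

```python
BASE = "https://projects.apache.org/doap/"


def project_doap_url(committee_id, project_id):
    """URL of a project's DOAP file: {committee id}/{project id}.rdf."""
    if project_id == "pmc":
        # Assumed rule: 'pmc' is reserved for the committee data file.
        raise ValueError("project id 'pmc' would clash with pmc.rdf")
    return "%s%s/%s.rdf" % (BASE, committee_id, project_id)


def pmc_descriptor_url(committee_id):
    """URL of the committee's PMC data file (assumed same directory)."""
    return "%s%s/pmc.rdf" % (BASE, committee_id)


if __name__ == "__main__":
    print(project_doap_url("httpd", "httpd"))
    print(pmc_descriptor_url("httpd"))
```
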
>
>>
>> But in the meantime, I still think it would be better to take local
>> copies and not commit them to SVN.
> here, I disagree :)
>
>>
>> > I know this is a long post: sorry, could not make it shorter.
>> >
>> > Switching from projects old to projects new without changing much things to
>> > DOAP sources was only the beginning of a story: we need to define next
>> > steps.
>> Yes.
>> What data needs to be collected by PMCs?
>> In what format is it stored?
>> Where is it stored?
>>
>> These would probably be better discussed on a Wiki.
> does it mean you have a proposal?
>
> Regards,
>
> Hervé
>
>>
>> > Regards,
>> >
>> > Hervé
>> >
>> >
>> > [1] https://projects-old.apache.org/guidelines.html see 2 last bullets:
>> > - PMCs can be referenced as an rdf:resource that points at
>> > http://<pmc>.apache.org/. e.g.
>> > <asfext:pmc rdf:resource="http://httpd.apache.org/" />.
>> > In this case, the PMC descriptor file must be called <pmc>.rdf and must be
>> > stored in the directory:
>> > http://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/data_files/
>> > - PMCs descriptors can also be stored anywhere else (e.g. on
>> > the TLP website or in SVN), in which case they must be referenced using
>> > the full URL, for example
>> > <asfext:pmc rdf:resource="http://tlp.apache.org/pmc/tlp.rdf" />
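
To make the special case in [1] concrete, here is a minimal sketch of the
lookup a consumer has to do under that convention. The function name and
the regular expression are assumptions; only the two rules and the
data_files URL come from the guidelines quoted above.

```python
import re

DATA_FILES = ("http://svn.apache.org/repos/asf/infrastructure/"
              "site-tools/trunk/projects/data_files/")

# Matches the short form http://<pmc>.apache.org/ and captures <pmc>.
SHORT_FORM = re.compile(r"^http://([a-z0-9-]+)\.apache\.org/$")


def pmc_descriptor_location(resource):
    """Map an asfext:pmc rdf:resource value to the descriptor's URL."""
    match = SHORT_FORM.match(resource)
    if match:
        # "Magic" case: the descriptor is <pmc>.rdf in the shared directory.
        return DATA_FILES + match.group(1) + ".rdf"
    # Otherwise the resource is already the full URL of the descriptor.
    return resource


if __name__ == "__main__":
    print(pmc_descriptor_location("http://httpd.apache.org/"))
    print(pmc_descriptor_location("http://tlp.apache.org/pmc/tlp.rdf"))
```

Every tool that reads the DOAPs has to carry this branch, which is the
special-case code I was referring to.
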
>> >
>> > On Sunday 19 July 2015 09:48:17 sebb wrote:
>> >> On 15 July 2015 at 22:11,  <hboutemy@apache.org> wrote:
>> >> > Author: hboutemy
>> >> > Date: Wed Jul 15 21:11:32 2015
>> >> > New Revision: 1691273
>> >> >
>> >> > URL: http://svn.apache.org/r1691273
>> >> > Log:
>> >> > import projects DOAP files updates
>> >> >
>> >> > Modified:
>> >> >     comdev/projects.apache.org/site/doap/cxf/cxf.rdf
>> >> >     comdev/projects.apache.org/site/doap/httpd/httpd.rdf
>> >> >     comdev/projects.apache.org/site/json/foundation/projects.json
>> >> >     comdev/projects.apache.org/site/json/projects/cxf.json
>> >> >     comdev/projects.apache.org/site/json/projects/httpd.json
>> >>
>> >> Why are these copies being committed to SVN?
>> >>
>> >> Projects-old makes do with a local copy of the files which it keeps in
>> >> sync with the ones listed in files.xml
>> >>
>> >> It seems wasteful and unnecessary to create new backup copies in SVN.
>> >>
>> >> AFAICT they are bound to be out of date as they are committed manually.
>> >>
>> >> Furthermore there is also a danger that the wrong copy may be updated
>> >> by someone.
>