Date: Mon, 25 Nov 2013 10:02:20 +0000
Subject: Re: Model Version 5.0.0
From: Stephen Connolly
To: Maven Developers List
Cc: Manfred Moser

First off (and this is addressed at drive-by readers; most everyone else
knows me well enough to know this anyway): I may be the PMC chair, but
99.99% of the things I say are not said as the PMC chair. They are said
as a committer to the project who is interested in its current and
future health. I do not have any extra special influence.

(If I *were* stating something as PMC chair it would be sent from my
apache account and I would put on the hat with a phrase such as "as PMC
chair"... and it would probably only be to resolve a stalemate in a
technical choice that had stalemated both the committers and the PMC and
was threatening the reputation of the ASF... i.e. the board would likely
be watching closely... )

On 25 November 2013 07:46, Kristian Rosenvold wrote:

> IMO publishing to central/archiva would involve publishing the "richest"
> format available. Based on user-agent identification (or lack of a given
> request param indicating old-style client) the repository should be able to
> down-transform a v5 pom to a v4 pom "on the fly" ?

How would that handle GPG signatures of the pom file?
If you allow for on-the-fly transformation then you lose the GPG
signature immediately. The dependency information of a pom is critical
to maintaining the integrity of the actual file. For example, if you can
modify the pom file you can replace one dependency with another that you
control, which gives you a hook with which to inject malicious code. I
see that as a credible risk.

For sure we have been less than awesome at providing users with the
tools to verify the GPG signatures of resolved artifacts and pom
files... but that does not mean we should mandate that the repository
adopt technologies that render such tooling impossible to achieve.

> We're not going to be losing semantic backward compatibility on any of
> the changes I've seen suggested yet ?

I think every change has a mapping back to modelVersion 4.0.0... not
every change will have a complete mapping back to modelVersion 4.0.0...
or to put it another way, we can transform 5.0.0 to 4.0.0... and
similarly, as the 5.0.0 model would be an extension of 4.0.0, we can
extend it up... but 5.0.0 -> 4.0.0 -> 5.0.0 would lose information on
the down transform... 5.0.0 is a double and 4.0.0 is a float (3.0.0 is
an unsigned long ;-) )

> Also, did I miss the bit where someone explained why the whole "how to
> build" section cannot be stripped away upon publication ? I don't
> understand why that means we need multiple files.

Let's say, for sake of argument, that we decide to go with the build vs
dependency-consumer split.

When you check out your source tree you will see something like pom.xml
(or pom.json or whatever format we decide for *building*).

So what do we deploy to the repo... well, for org.machu.foo:foobar:1.0

* We have to publish a file that is parsable by 4.0.0 readers...
  otherwise we limit consumers of our artifact to those using a client
  that understands the new format... that would be bad... so
  foobar-1.0.pom gets published... it's a modelVersion 4.0.0 pom.
  *Because* we are publishing this for *consumers of the artifact* it
  does not need any build information. We can strip all that cruft out
  and just provide the dependency information. Further, we can resolve
  *all* versions to pinned explicit versions and strip out scopes that
  do not make sense for consumers, e.g. `test`. We can even add
  exclusions based on the complete tree (which would allow removing
  `provided` scope dependencies as well as being necessary for the
  `provides` extra semantic information I want to see added). None of
  these changes will break existing clients. It does mean that a
  modelVersion 5.0.0 pom will not be able to generate a pom for 4.0.0
  clients that contains some of the bugs/features that some people seem
  to rely on, e.g. ${} expansion in ... but we don't need to maintain
  such guarantees when we have a new schema.

* We want to be able to expose the new dependency model information to
  consumers that can understand such information... so let's publish
  that information as a (briefly searches for relatively unused file
  extensions... Dependency ModeL... ok) .dml file. This file will
  contain the dependencies, extra semantic information such as provides,
  etc. Again it would be fully resolved and not contain a reference to a
  parent .dml file. In other words, once you get that file you have
  everything needed to parse that file (well, except perhaps for a
  transformation mapping to down-map the file into a format you *can*
  parse... my XSLT idea).

* Finally we have the artifact itself, foobar-1.0.jar, and all the gpg
  signatures and the .md5 and .sha1 hashes... (we could argue only the
  gpg signatures are needed... but older clients rely on the hashes, so
  we cannot really break them...
  though a good repository manager could certainly generate them if the
  gpg signatures verify)

So the complete list in this scheme:

  foobar-1.0.pom
  foobar-1.0.pom.gpg
  foobar-1.0.pom.md5
  foobar-1.0.pom.sha1
  foobar-1.0.pom.gpg.sha1
  foobar-1.0.pom.gpg.md5
  foobar-1.0.dml
  foobar-1.0.dml.gpg
  foobar-1.0.dml.md5 (could perhaps omit as only new clients will read this)
  foobar-1.0.dml.sha1 (could perhaps omit as only new clients will read this)
  foobar-1.0.dml.gpg.sha1 (could perhaps omit as only new clients will read this)
  foobar-1.0.dml.gpg.md5 (could perhaps omit as only new clients will read this)
  foobar-1.0.jar
  foobar-1.0.jar.gpg
  foobar-1.0.jar.md5
  foobar-1.0.jar.sha1
  foobar-1.0.jar.gpg.sha1
  foobar-1.0.jar.gpg.md5

So that is all fine and dandy...

Now we refactor the build, introducing a common parent pom,
foobar-parent... what do we need to publish for that?

* Well, my first "get out of jail" card is to mandate that when
  building, you cannot use a parent pom that has a *newer* modelVersion
  than the child pom. Thus we do not have to worry about people using
  Maven 3.2 and trying to use foobar-parent:1.0 as their parent pom. In
  the deployed modelVersion 4.0.0 pom we set the required version of
  Maven and we inject an enforcer rule bound to the `validate` phase
  that immediately fails the build with a message stating that you
  cannot use it as a parent. Thus even if you use Maven 4.0.0 to build a
  modelVersion 4.0.0 pom, you still will not be able to have a
  modelVersion 5.0.0 parent. I think this is reasonable, as we cannot
  expect to fully down-map the dependency features, let alone the build
  features. So we are deploying a *generated* modelVersion 4.0.0 pom as
  foobar-parent-1.0.pom which is stripped to just effective dependencies
  (etc as for the jar) and has a section that causes anyone who tries to
  use it as a parent pom from a pre-maven 4.0.0 format pom to get an
  immediate build failure.
* There are valid cases where a parent pom can include a set of
  dependencies that are common to all child projects. It may not be a
  style that I like, but just as I am not going to give out if somebody
  writes their *project* and has the idiotic idea of using TABs to
  indent (I'll moan if I have to make a contribution to their project
  though), I do not think we should prevent such a use case.
  Additionally, and perhaps more importantly, there can be side
  artifacts for a pom packaging. Thus we really should be publishing a
  .dml file for the parent. Most likely it will be empty (we don't need
  because .dml files *never* include a parent reference) but the file is
  needed for any side-artifacts.

* What about people using this project *as a parent*... we need to
  deploy something for them... we can assume they will be able to
  understand our modelVersion and format (because we have used that get
  out of jail card already to prevent the modelVersion 4.0.0 children),
  so let's just deploy the pom with a classifier of build:

  foobar-parent-1.0.pom
  foobar-parent-1.0.pom.gpg
  foobar-parent-1.0.pom.md5
  foobar-parent-1.0.pom.sha1
  foobar-parent-1.0.pom.gpg.sha1
  foobar-parent-1.0.pom.gpg.md5
  foobar-parent-1.0-build.pom
  foobar-parent-1.0-build.pom.gpg
  foobar-parent-1.0-build.pom.md5 (could perhaps omit as only new clients will read this)
  foobar-parent-1.0-build.pom.sha1 (could perhaps omit as only new clients will read this)
  foobar-parent-1.0-build.pom.gpg.sha1 (could perhaps omit as only new clients will read this)
  foobar-parent-1.0-build.pom.gpg.md5 (could perhaps omit as only new clients will read this)
  foobar-parent-1.0.dml
  foobar-parent-1.0.dml.gpg
  foobar-parent-1.0.dml.md5 (could perhaps omit as only new clients will read this)
  foobar-parent-1.0.dml.sha1 (could perhaps omit as only new clients will read this)
  foobar-parent-1.0.dml.gpg.sha1 (could perhaps omit as only new clients will read this)
  foobar-parent-1.0.dml.gpg.md5 (could perhaps omit as only new clients will read this)
  foobar-1.0-src.tar.gz (illustrating the most common side-artifact for pom projects)
  foobar-1.0-src.tar.gz.gpg
  foobar-1.0-src.tar.gz.md5
  foobar-1.0-src.tar.gz.sha1
  foobar-1.0-src.tar.gz.gpg.sha1
  foobar-1.0-src.tar.gz.gpg.md5

That is my view of *one way* to get to modelVersion 5.0.0. I think that
*technically* the above could work. There are issues:

* Newer clients will go looking for the .dml file and then fall back to
  the .pom if the .dml is missing... that makes 5 requests (.dml, .pom,
  .pom.gpg, .jar, .jar.gpg - or replace .gpg with whatever hash you
  want) to get a .jar file rather than 4, in other words a 25% increase
  in requests for older artifacts (or a 50% increase, 3 requests rather
  than 2, if you don't want integrity checks)... we could do a bulk
  generation of the .dml files... but then we have to generate gpg
  signatures for those files, which would break the trust that gpg is
  supposed to inject. Given that older clients currently go hunting for
  two hashes I think we can ignore this issue, i.e. it's actually better
  than the six requests for .pom, .pom.md5, .pom.sha1, .jar, .jar.md5,
  .jar.sha1. This would therefore be using the .gpg file as a download
  integrity check, and then an optional additional check that users can
  choose to turn on would be to check that the key used to sign is
  trusted.

* I am not sure how down-model versioning would work in reality. So the
  idea here is that we say that the .dml file is a machine generated
  format. It makes sense, to me, that this would be XML (because XSLT is
  cross platform-ish). We would mandate that the first element be the
  modelVersion... could be via a namespace or an element... does not
  matter too much for this. A parser thus reads the modelVersion easily.
  If it is a known modelVersion... fine, proceed with the parse. If it
  is a newer modelVersion then you go download
  org.apache.maven:model-mapping:${modelVersion}:xsl and run the .dml
  through that transformation... lather, rinse, repeat until you have a
  modelVersion that you understand...
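A rough sketch of that lather-rinse-repeat loop, just to show the shape
of it. This is not a proposal for the actual implementation: plain
Python callables stand in for the XSLT stylesheets, a dict stands in for
the parsed .dml, and the version numbers 5.1.0 and 5.2.0, the field
names and the helper names are all hypothetical.

```python
from typing import Callable, Dict, Set

Doc = Dict[str, object]              # stand-in for a parsed .dml document
DownMapping = Callable[[Doc], Doc]   # stand-in for one published XSLT


def drop_fields(target_version: str, *fields: str) -> DownMapping:
    """Build a down-mapping that removes `fields` and relabels the document."""
    def mapping(doc: Doc) -> Doc:
        out = {k: v for k, v in doc.items() if k not in fields}
        out["modelVersion"] = target_version
        return out
    return mapping


# Hypothetical registry, keyed by the modelVersion each mapping accepts.
# In the scheme above this lookup would be a download of
# org.apache.maven:model-mapping:${modelVersion}:xsl instead.
MAPPINGS: Dict[str, DownMapping] = {
    "5.2.0": drop_fields("5.1.0", "hypotheticalFieldB"),
    "5.1.0": drop_fields("5.0.0", "hypotheticalFieldA"),
}


def down_map(doc: Doc, understood: Set[str],
             mappings: Dict[str, DownMapping]) -> Doc:
    """Apply down-mappings until the document has a modelVersion we understand."""
    seen: Set[str] = set()
    while doc["modelVersion"] not in understood:
        version = str(doc["modelVersion"])
        if version in seen:
            # A badly published mapping must not make the client spin forever.
            raise ValueError(f"mapping cycle at modelVersion {version}")
        seen.add(version)
        if version not in mappings:
            raise LookupError(f"no down-mapping published for modelVersion {version}")
        doc = mappings[version](doc)
    return doc


newer = {
    "modelVersion": "5.2.0",
    "dependencies": ["org.machu.foo:foobar:1.0"],
    "hypotheticalFieldA": "a",
    "hypotheticalFieldB": "b",
}
# A client that only understands 5.0.0 walks 5.2.0 -> 5.1.0 -> 5.0.0.
older = down_map(newer, {"5.0.0"}, MAPPINGS)
```

Each step only has to know how to get to the next older version, so a
client never needs more than one mapping per modelVersion it does not
understand. The sketch also shows the lossy part: the extra fields are
simply gone after the down-map.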
  We would need to EITHER be very careful when publishing the XSLT files
  OR relax the rules on re-downloading non-SNAPSHOTs for
  org.apache.maven:model-mapping only (the latter could produce
  irreproducible builds though).

  In any case I think that is how we can allow for future evolution of
  the .dml modelVersion (NOTE: this need not be the pom
  modelVersion...)... but where we have the greater need for schema
  change is on the build side, not on the dependency list side... so I
  think it should not be too much of a concern... we just have to be
  very careful with .dml schema changes.

What does this get us?

* It lets us change the build schema
* It lets us change the build format... the pom need not be XML any more

In short, it frees us up to change.

Is this the only way? Nope... it is the best way I can think of... I
hope that somebody has a better suggestion and I fear that this is the
best... but there certainly are a lot worse ways of evolving our schema.

-Stephen

> I'm exposed to "the competition" at @dayjob these days, and I must say I
> think reducing verbosity and duplication is /the/ most important feature of
> a v5 pom for me.
>
> Kristian