maven-dev mailing list archives

From Stephen Connolly <stephen.alan.conno...@gmail.com>
Subject Re: [DISCUSS] Incorporating an ArchitectureId into the GAVCT of the repository
Date Fri, 02 Sep 2016 23:59:17 GMT
On Friday 2 September 2016, Tibor Digana <tibor.digana@googlemail.com>
wrote:

> @Stephen
> I know that you don't want to have too big change and maven1->maven2 but
> one way or another XML is strictly defined by XSD.
> We must accept that fact to accept a format with more freedom and therefore
> I would prefer code/interface instead of XML like Groovy script :
> *dependencies: ["com.example:foobar:1.0:jar"]*


The whole point of switching from the POM to a Project Dependency Trees file
(aka "consumer pom") is that the build POM format becomes irrelevant.

The .pdt file is what communicates the dependency trees of each artifact at
the main (GAV) coordinates

The classifier currently cannot provide for different dependencies...
Hence, for example, if you depend on com.spotify:docker-client's jar or its
shaded jar you get the same dependency tree... And all the effort of
shading is undone, in that the shaded dependencies are pulled in anyway
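
To make that concrete, here is a minimal Python sketch of the current
`default` layout. The function names and the 1.0 version number are made up
purely for illustration; this is not Maven's actual resolver code. The POM
is looked up from groupId:artifactId:version alone, so the plain jar and the
shaded jar both end up pointing at the same POM:

```python
def default_layout_path(group_id, artifact_id, version, type_, classifier=None):
    """Relative repository path of an artifact under the `default` layout."""
    suffix = "-" + classifier if classifier else ""
    return "{0}/{1}/{2}/{1}-{2}{3}.{4}".format(
        group_id.replace(".", "/"), artifact_id, version, suffix, type_)


def pom_path(group_id, artifact_id, version, classifier=None):
    """Path of the POM that describes an artifact.

    The classifier is accepted but deliberately ignored: there is only one
    POM per groupId:artifactId:version.
    """
    return default_layout_path(group_id, artifact_id, version, "pom")


# Hypothetical version number, for illustration only.
print(default_layout_path("com.spotify", "docker-client", "1.0", "jar"))
print(default_layout_path("com.spotify", "docker-client", "1.0", "jar",
                          classifier="shaded"))
# Both artifacts share a single POM, hence a single dependency tree.
print(pom_path("com.spotify", "docker-client", "1.0"))
print(pom_path("com.spotify", "docker-client", "1.0", classifier="shaded"))
```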

We can update the deploy plugin "now" to publish .pdt files (assuming we
can agree a schema)... With the layout I am proposing we can simply move
everyone over without forking repositories

> It's nice that we are in between Nexus-Maven-User.

Not quite... And we'd have to coordinate adoption of a maven5 layout with
Archiva, Nexus, JFrog... Plus, to get that adoption, we'd need to get all
the consumers of content to pick up on the maven5 layout

If we fork central we still have no guarantees that users will update their
settings to use the new layout... For compatibility with older builds
maven5 will have to understand the old layout (just as maven2 understood
the v1 layout)...

So there may well be no clients *demanding* support for v5 layout...

Proxies such as Nexus will need to support the new layout as well before
corporate users can pick it up...

At work we cannot upgrade to Nexus 3 *yet* because we need staging
support... I can see lots of enterprise customers just not switching to a
v5 layout in any sort of hurry... Which means nobody can leverage the
better fixes in resolution

If we have a layout that can be superimposed (like my suggestion) then we
can start letting people pick up the improvements without
completely throwing away their existing investment...
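
As a rough sketch of what "superimposed" means here (Python again, with
hypothetical function names, and assuming the `-` separator from my original
mail is the one we go with), the architectureId is simply folded into the
legacy artifactId, so the result is still a perfectly ordinary path in
today's layout:

```python
def legacy_coordinates(group_id, artifact_id, architecture_id=None):
    """Map (groupId, artifactId, architectureId) onto a 4.0.0-style
    (legacyGroupId, legacyArtifactId) pair."""
    if architecture_id is None:
        return group_id, artifact_id
    # Assumption: '-' is the separator; '.' was the other candidate.
    return group_id, artifact_id + "-" + architecture_id


def superimposed_path(group_id, artifact_id, version, type_,
                      architecture_id=None, classifier=None):
    """Path in the existing `default` layout for a possibly
    architecture-specific artifact."""
    legacy_group, legacy_artifact = legacy_coordinates(
        group_id, artifact_id, architecture_id)
    suffix = "-" + classifier if classifier else ""
    return "{0}/{1}/{2}/{1}-{2}{3}.{4}".format(
        legacy_group.replace(".", "/"), legacy_artifact, version, suffix, type_)


# The architecture-neutral .pdt sits at the plain GAV coordinates...
print(superimposed_path("com.example", "foobar", "1.0", "pdt"))
# ...and the ARM64 native library at a path any existing client can fetch.
print(superimposed_path("com.example", "foobar", "1.0", "so",
                        architecture_id="elf_arm_64"))
```

An old client just asks for com.example:foobar-elf_arm_64:1.0:so and gets the
file without knowing anything about architectures, while a newer client can
start from the .pdt at the main coordinates.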


> It has advantages that
> we define best practices for Users group but the same has penalty because
> Maven group takes this on their shoulder and the same group of Maven devs
> has to spend new effort. Maybe it is better to provide some more freedom to
> Users and Nexus and have Maven like deployer tool able to check the POM
> semantics and syntax  and credentials, etc.
>
> >>So that will not scale and prevents mirroring
> Why not?
> All I wanted is to tell you that POM can be polymorphic. The POM can be
> almost the same or extended without extending schema but meaning would be
> different which depends on architectureId.


The build POM can be whatever we like... But unless we fork central, the
deployed POM can only ever be modelVersion 4.0.0... I wish it were
otherwise, but to do otherwise would basically break central

If we want to expose improved dependency information to consumers we need
to do that in a different file... What I am provisionally calling the .pdt
file... To stop all the madness of being stuck on 4.0.0 from repeating, we
need the .pdt format to have a limited future compatibility contract...
Currently the only platform-neutral technology to provide that would seem to
be XML... Wish it were otherwise but XSLT is a standard... There is not
currently any standard for transforming other formats (except you can do
JSON => XML then apply XSLT and then map back to JSON as detailed in the
XSLT specification)


> The classifier already does not
> appear in URL path but it alters the binary of the same
> groupId:artifactId:version. Classifier or Architecture is the same problem
> for me.
> It should scale because the point is to keep POM structure old and the REST
> server would store POM per architectureId.


This would prevent storing the GPG signatures of the POM/PDT... Which would
mean that you cannot verify that the dependency tree you are resolving is the
one that was produced for the artifact... If somebody maliciously injected
a dependency on an artifact with a known vulnerability in order to force
your project to ship that vulnerability... That would be a bad thing.

(Bad enough that the default JFrog settings modify the POM files that they
proxy)

Basically Nexus/Artifactory/JFrog cannot be rewriting the POM/.pdt on
the fly without becoming a man-in-the-middle (MitM) attack vector... This is
also why we cannot really migrate the old central artifacts over to a v5 fork
of central, as we'd basically have to turn off any attempts at GPG validation...

However we can provide an XSLT that converts a POM into a .pdt so that any
client just has to know how to parse .pdt files and can validate the
signatures of the XSLT and the source file


>
> The same could be done already
> with classifier. Another advantage is storage which currently is a pure
> webcontent but nowadays storages are key-to-value or NoSql databases. We
> should give the Remote Repositories more freedom how the REST API is
> implemented, unlike webcontent however I believe nowadays Nexus team
> already understood this and maybe internally it is not pure webcontent.
> In few years later we or Nexus wants to add "cloudId" similar to what you
> introduced with "architectureId". Again the pair in database would be
> key=value is architectureId&cloudId=POM; just a pure example.
>
>
>
>
> On Fri, Sep 2, 2016 at 8:06 PM, Stephen Connolly <
> stephen.alan.connolly@gmail.com> wrote:
>
> > On Friday 2 September 2016, Tibor Digana <tibor.digana@googlemail.com>
> > wrote:
> >
> > > @Stephen
> > > IIUC the 3rd party artifact is a platform (architecture) specific
> > > dependency for another project. Thus the 3rd party artifact can be a
> > > tree of dependencies for my project.
> > > In my imagination some mapping between _architecture_ and
> > > _dependency_tree_.
> > >
> > > >>we need to move Maven forward
> > > RESTful Maven is my answer.
> > > I may not to read documentation of 3rd party project. Instead let the
> > Maven
> > > retrieve a List of architectures from all dependencies via REST
> service:
> > > https://repo1.maven.org/maven2/*rest/api*/*query-architectures*
> > > /<fully_qualified_artifact_path>
> >
> >
> > So that will not scale and prevents mirroring
> >
> > Better is to store metadata and let the client retrieve the metadata and
> > parse the query itself
> >
> >
> > > the response will answer with list of architectures:
> > > linux-ppc, linux-arm64, linux-x86 and
> > > Consumer's project (POM) would require another architecture
> linux-x86_64
> > > and the build fails.
> > > Suppose the required architecture matched with linux-arm64 and the next
> > > query will ask for versions
> > > https://repo1.maven.org/maven2/*rest/api*/*query-versions*
> > > /<fully_qualified_artifact_path>
> > > finally query artifact binary
> > > https://repo1.maven.org/maven2/*rest/api*/*query-artifact-binary*
> > > /<fully_qualified_artifact_path>
> > >
> > >
> > That is getting towards irreproducible builds... I don't think we want to
> > end up there
> >
> > > Now, you do not need to specify the pattern how the artifact is described
> > > groupId:artifactId::::.
> > > It's because the REST server provides Maven with all operations we can
> > > query concrete binary from repo.
> > > The deployer of POM can specify artifact description. Now it is
> > > architectureId been added but in next few years it will be something
> > else.
> > > Look if you deploy artifact to *https://repo1.maven.org/maven2* via
> > > *rest/api* then Nexus server will check if the POM is compliant with
> > > modelVersion 4.0 *maven2* in the path.
> > > This way we separate artifacts POM which are not compatible, we let the
> > > maven break the backwards compatibility without breaking old
> repositories
> > > and old code line of Maven and layout, and finally it is the
> > responsibility
> > > of *consumer* what default URL of Maven Central is specified in his
> > > settings.xml and therefore what modelVersion as well.
> > >
> > >
> > Been there with maven1->maven2
> >
> > Consensus is "let's not do that again"
> >
> >
> > > So the next generation 5.0 would go to
> > > *https://repo1.maven.org/maven5*.
> > >
> > > WDYT?
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Sep 1, 2016 at 12:07 PM, Stephen Connolly <
> > > stephen.alan.connolly@gmail.com> wrote:
> > >
> > > > One of the things I feel is necessary to grow Maven in the
> modelVersion
> > > > 5.0.0 world is to start taking account of architecture specific
> > > artifacts.
> > > >
> > > > Currently, the Maven repository layout does not handle architecture
> > > > specific dependencies well.
> > > >
> > > > So, for example:
> > > >
> > > > Say I have a foo.jar that depends on a native library... bar.dll /
> > > > libbar.so / etc
> > > >
> > > > Ideally we'd like to say that foo just depends on bar...
> > > >
> > > > A consumer of foo that is running on, say my local machine, could then
> > > > see that I am running on os-x-x86_64 and because I am wanting to run
> > > > tests... it would look for bar with the architecture of `os-x-x86_64`
> > > > to get the native library for me
> > > >
> > > > When I am building the installer for windows on my os-x machine
> (using
> > > say
> > > > .NET and the WiX toolchain) the corresponding (future does not exist
> > yet)
> > > > maven plugin could request the win-x86 architecture of the dependency
> > and
> > > > the rpm plugin could request the linux-ppc, linux-arm64, linux-x86
> and
> > > > linux-x86_64 artifacts in order to produce the corresponding rpm
> > > > architecture artifacts
> > > >
> > > > So when I think about this concept... I feel it is important that we
> > > find a
> > > > way to introduce the architectureId into the GACVT of the repository.
> > > >
> > > > When we do this, to my mind, we need to be mindful that modelVersion
> > > 4.0.0
> > > > consumers would like to be able to consume these architecture
> specific
> > > > dependencies also... and the 4.0.0 GAV constraints will constrain the
> > > > possible solutions that we can pick if we value letting 4.0.0
> consumers
> > > > access these architecture specific artifacts via the `default` layout
> > we
> > > > currently employ for the maven repository.
> > > >
> > > > So the first things first... our current `default` layout transforms
> > the
> > > > GroupId:ArtifactId:Version:Classifier:Type into a repository URL of
> > > >
> > > > `${groupId.replace('.','/')}/${artifactId}/${version}/${artifactId}-${version}${classifier==null?'':'-'+classifier}.${type}`
> > > >
> > > > If we want to add architectureId into that URL Path and still have
> that
> > > > resolvable by GAVCT at a modelVersion 4.0.0 coordinate, we are
> > basically
> > > > left with stuffing the architectureId into one of the existing
> > > > components...
> > > >
> > > > Now when we think about an architecture specific artifact, the first
> > > thing
> > > > that comes to mind is that each architecture specific artifact most
> > > likely
> > > > has different dependencies... hopefully the .pdt file (that would be
> > > > deployed at the GAV without an architecture... modulo multi-machine
> > > builds)
> > > > would provide the architecture specific dependency trees so that
> > > > modelVersion 5.0.0 aware consumers would - just naturally - be aware
> of
> > > > those differences in dependencies
> > > >
> > > > But - if we want to give the modelVersion 4.0.0 consumers our best
> > > > effort - we probably need to give each architectureId its own
> > > > modelVersion 4.0.0 pom.
> > > >
> > > > In other words, I do not think we should try to munge the
> > architectureId
> > > > into either classifier or type as both of those would force the
> > > > dependencies to be viewed as having the same dependencies in the
> > > > modelVersion 4.0.0 world
> > > >
> > > > So that leaves us with groupId, artifactId and version...
> > > >
> > > > I personally think version is a non-runner. In modelVersion 4.0.0 you
> > can
> > > > only depend on one version of a dependency at a time... version
> ranges
> > > > would become completely and utterly unusable (never mind that they
> are
> > > > unusable now)... plus my gut tells me that it would be a total mess!
> > > >
> > > > So that leaves groupId and artifactId... our choices basically boil
> > down
> > > to
> > > >
> > > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > > '${architectureId}.${artifactId}'
> > > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > > '${architectureId}-${artifactId}'
> > > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > > '${artifactId}.${architectureId}'
> > > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > > '${artifactId}-${architectureId}'
> > > > legacyGroupId == '${groupId}.${architectureId}'; legacyArtifactId ==
> > > > '${artifactId}'
> > > > legacyGroupId == '${groupId}.${artifactId}'; legacyArtifactId ==
> > > > '${architectureId}'
> > > >
> > > > I personally think that the ones that place `architectureId`
> lexically
> > > > before `artifactId` are not "right"... the most important coordinate
> is
> > > the
> > > > groupId, the next most is the artifactId, then the architecture, then
> > the
> > > > version, etc
> > > >
> > > > So to my mind that leaves us with:
> > > >
> > > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > > '${artifactId}.${architectureId}'
> > > > legacyGroupId == '${groupId}'; legacyArtifactId ==
> > > > '${artifactId}-${architectureId}'
> > > > legacyGroupId == '${groupId}.${artifactId}'; legacyArtifactId ==
> > > > '${architectureId}'
> > > >
> > > > Now when we look at how, say, a modelVersion 4.0.0 consumer would use
> > > these
> > > > dependencies... the variant where we shift the artifactId into the
> > > groupId
> > > > would mean that you would end up with loads of `linux-arm`
> > > > "legacyArtifactId" dependencies in your modelVersion 4.0.0
> consumer...
> > > > which would presumably be ugly (just like now if you have two
> matching
> > > > `artifactId` dependencies in your .war which forces us to
> disambiguate
> > by
> > > > prefixing the groupId when copying into WEB-INF/lib)... so I am going
> > to
> > > > reject that one also.
> > > >
> > > > The convention seems to be that the artifactId does not contain a `.`
> > > with
> > > > most artifacts that I am aware of using `-` as the separator... this
> > > could
> > > > be used to argue either way... my preference is to run with `-` as
> the
> > > > separator... though I am open to using `.` to provide a convention
> that
> > > > architecture is distinguished using a `.`
> > > >
> > > > So how would this work...
> > > >
> > > > Ok, I have my foobar project that builds a .jar and the native
> > libraries
> > > > that are required by that .jar
> > > >
> > > > So from the reactor for that project we want to deploy
> > > >
> > > > com.example:foobar:::1.0:pom (the legacy pom for the .jar to allow
> > > > modelVersion 4.0.0 consumption of the jar)
> > > > com.example:foobar:::1.0:pdt (the modern project dependency trees for
> > all
> > > > attached artifacts)
> > > > com.example:foobar:::1.0:jar (the jar)
> > > > com.example:foobar::javadoc:1.0:jar (the javadoc jar)
> > > > com.example:foobar::sources:1.0:jar (the source jar)
> > > > com.example:foobar:win_x86::1.0:pom (the legacy pom for the 32-bit
> > DLL)
> > > > com.example:foobar:win_x86::1.0:dll (the 32-bit DLL... alternatively
> > the
> > > > type might be `native-library` or `lib` but let's assume DLL)
> > > > com.example:foobar:win_x86_64::1.0:pom (the legacy pom for the
> 64-bit
> > > DLL)
> > > > com.example:foobar:win_x86_64::1.0:dll (the 64-bit DLL)
> > > > com.example:foobar:osx_x86_64::1.0:pom (the legacy pom for the
> 64-bit
> > > OS-X
> > > > .dylib)
> > > > com.example:foobar:osx_x86_64::1.0:dylib (the 64-bit .dylib...
> > > > alternatively the type might be `native-library` or `lib` but let's
> > > assume
> > > > dylib)
> > > > com.example:foobar:elf_arm::1.0:pom (the legacy pom for the linux
> ARM
> > > .so)
> > > > com.example:foobar:elf_arm::1.0:so (the ARM .so ... alternatively
> the
> > > type
> > > > might be `native-library` or `lib` but let's assume so)
> > > > com.example:foobar:elf_x86::1.0:pom (the legacy pom for the linux
> x86
> > > > 32-bit .so)
> > > > com.example:foobar:elf_x86::1.0:so (the x86-32-bit .so)
> > > > com.example:foobar:elf_x86_64::1.0:pom (the legacy pom for the linux
> > x86
> > > > 64-bit .so)
> > > > com.example:foobar:elf_x86_64::1.0:so (the x86 64-bit .so)
> > > >
> > > > My main build machine cannot cross-compile for PPC or ARM64... so we
> > have
> > > > two other build machines that will want to produce the extra
> > architecture
> > > > specific artifacts...
> > > >
> > > > com.example:foobar:elf_ppc::1.0:pom (the legacy pom for the linux
> PPC
> > > .so)
> > > > com.example:foobar:elf_ppc::1.0:so (the PPC .so)
> > > >
> > > > and
> > > >
> > > > com.example:foobar:elf_arm_64::1.0:pom (the legacy pom for the linux
> > ARM
> > > > 64-bit .so)
> > > > com.example:foobar:elf_arm_64::1.0:so (the ARM 64-bit .so)
> > > >
> > > > In order to accommodate delayed deployment, I am going to suggest
> that
> > > the
> > > > PPC and ARM64 deployments should publish their *supplemental* pdts at
> > > their
> > > > coordinates, e.g.
> > > >
> > > > com.example:foobar:elf_ppc::1.0:pdt (the supplemental project dependency
> > > > trees for the PPC reactor artifacts)
> > > >
> > > > and
> > > >
> > > > com.example:foobar:elf_arm_64::1.0:pdt (the supplemental project
> > > > dependency trees for the ARM64 reactor artifacts)
> > > >
> > > > So ultimately we would end up with the following files being deployed
> > (in
> > > > three "atomic" deployments):
> > > >
> > > > com/example/foobar/1.0/foobar-1.0.pom
> > > > com/example/foobar/1.0/foobar-1.0.pdt
> > > > com/example/foobar/1.0/foobar-1.0.jar
> > > > com/example/foobar/1.0/foobar-1.0-javadoc.jar
> > > > com/example/foobar/1.0/foobar-1.0-sources.jar
> > > > com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.pom
> > > > com/example/foobar-win_x86/1.0/foobar-win_x86-1.0.dll
> > > > com/example/foobar-win_x86_64/1.0/foobar-win_x86_64-1.0.pom
> > > > com/example/foobar-win_x86_64/1.0/foobar-win_x86_64-1.0.dll
> > > > com/example/foobar-osx_x86_64/1.0/foobar-osx_x86_64-1.0.pom
> > > > com/example/foobar-osx_x86_64/1.0/foobar-osx_x86_64-1.0.dylib
> > > > com/example/foobar-elf_arm/1.0/foobar-elf_arm-1.0.pom
> > > > com/example/foobar-elf_arm/1.0/foobar-elf_arm-1.0.so
> > > > com/example/foobar-elf_x86/1.0/foobar-elf_x86-1.0.pom
> > > > com/example/foobar-elf_x86/1.0/foobar-elf_x86-1.0.so
> > > > com/example/foobar-elf_x86_64/1.0/foobar-elf_x86_64-1.0.pom
> > > > com/example/foobar-elf_x86_64/1.0/foobar-elf_x86_64-1.0.so
> > > >
> > > > com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.pom
> > > > com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.pdt
> > > > com/example/foobar-elf_ppc/1.0/foobar-elf_ppc-1.0.so
> > > >
> > > > com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pom
> > > > com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt
> > > > com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.so
> > > >
> > > > When a modelVersion 5.0.0 consumer does something like:
> > > >
> > > > compile: {
> > > >   dependencies: ["com.example:foobar:1.0:jar"]
> > > > }
> > > > test: {
> > > >   dependencies: ["org.junit:junit:5.0:jar"]
> > > > }
> > > >
> > > > and wants to run its tests on linux ARM64 it will start by resolving
> > > > `com/example/foobar/1.0/foobar-1.0.pdt` this will give it the
> > dependency
> > > > tree of the `.jar` which will declare an architecture dependent
> native
> > > > library dependency (somehow or other... this is why we may use
> > > > `native-library` as the "type")... because it knows that it is
> running
> > on
> > > > ARM64 architecture it will then know that it needs
> > > > `com.example:foobar:elf_arm_64::1.0:so` since this is not available
> in
> > > the
> > > > `com/example/foobar/1.0/foobar-1.0.pdt` trees it will then attempt to
> > > > download `com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pdt`
> > > > if that exists, it will use that tree... if it doesn't exist... we fail
> > > > the
> > > > build (technically we could fall back to checking for
> > > > `com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.pom` and
> > > > `com/example/foobar-elf_arm_64/1.0/foobar-elf_arm_64-1.0.so` before
> > > > failing
> > > > the build... but as we know the artifacts were produced by a 5.0.0
> > aware
> > > > producer - as we have `com/example/foobar/1.0/foobar-1.0.pdt`
> > resolved)
> > > >
> > > > A modelVersion 4.0.0 consumer is not really going to be able to have
> as
> > > > flexible a build... but at least they can - through declarations such
> > as
> > > >
> > > > <dependency>
> > > >   <groupId>com.example</groupId>
> > > >   <artifactId>foobar-elf_arm_64</artifactId>
> > > >   <version>1.0</version>
> > > >   <type>so</type>
> > > > </dependency>
> > > >
> > > > grab the .so to bundle into a .zip or installer and if they want to
> > > write a
> > > > pom with architecture based profile activation injecting test scoped
> > > > dependencies they can do that also
> > > >
> > > > WDYT?
> > > >
> > > > If anyone has any experience from the NMaven experiments, or
> learnings
> > > from
> > > > .deb or .rpm attempts to solve architecture dependent artifacts mixed
> > > with
> > > > noarch artifacts... please step forward and join the discussion.
> > > >
> > > > -Stephen
> > > >
> > > > Notes:
> > > >
> > > > 1. I am not saying what conventions will be used to define the
> > > > `architectureId` values here
> > > > 2. I am not discussing the schema for the .pdt files here... other than
> > > > the general principle that they will contain multiple dependency trees
> > > > for each artifact produced by the project
> > > > 3. I am not discussing how a modelVersion 5.0.0 build would be
> invoked
> > or
> > > > detect that it should just do the PPC deployment
> > > > 4. This proposal does not include the new metadata schema that we
> would
> > > > likely require to assist with such a deployment format
> > > > 5. I am not discussing or proposing a modelVersion 5.0.0 schema
> > here... I
> > > > use a non-XML format to help people mentally disassociate thinking
> > about
> > > > the architectureId specific things from the current 4.0.0 way of
> doing
> > > > things
> > > >
> > >
> > >
> > >
> > > --
> > > Cheers
> > > Tibor
> > >
> >
> >
> > --
> > Sent from my phone
> >
>
>
>
> --
> Cheers
> Tibor
>


-- 
Sent from my phone
