archiva-dev mailing list archives

From Joakim Erdfelt <joa...@erdfelt.com>
Subject MRM-463 - metadata handling / merging
Date Wed, 15 Aug 2007 00:14:23 GMT
MRM-463 has a bunch of unanswered questions for me.

[ link for the lazy to use http://jira.codehaus.org/browse/MRM-463 ]

.\ Synopsis \.

We have to maintain a sane metadata.xml for the m2 clients.


.\ Details \.

There are 2 major kinds of metadata.xml from what I can see.


.\ Metadata Type 1: [ groupId:artifactId ] \.

The first is obtained at the groupId:artifactId level, and contains the set 
of available versions for a specific artifactId, along with a link to the 
'current' or 'latest' version.


Example 1: 
http://repo1.maven.org/maven2/commons-beanutils/commons-beanutils/maven-metadata.xml

<metadata>
  <groupId>commons-beanutils</groupId>
  <artifactId>commons-beanutils</artifactId>
  <version>1.0</version>
  <versioning>
    <versions>
      <version>1.0</version>
      <version>1.2</version>
      <version>1.3</version>
      <version>1.4</version>
      <version>1.4-dev</version>
      <version>1.4.1</version>
      <version>1.5</version>
      <version>1.6</version>
      <version>1.6.1</version>
      <version>1.7-dev</version>
      <version>1.7.0</version>
      <version>20020520</version>
      <version>20021128.082114</version>
      <version>20030211.134440</version>
      <version>dev</version>
    </versions>
  </versioning>
</metadata>

This example is actually bad IMO, as the top-level /metadata/version 
element isn't the latest version, or the current version, or even the 
last uploaded version.
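
To make that complaint concrete, here is a minimal sketch in Python (standard
library only) that pulls the top-level version and the listed versions out of
a type 1 file. The function name describe_versions is made up for this mail,
the sample is a trimmed copy of Example 1, and "last listed entry" is only a
rough stand-in for "latest", since the list is not guaranteed to be in
release order.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of Example 1 (only a few of the listed versions are kept).
EXAMPLE_1 = """\
<metadata>
  <groupId>commons-beanutils</groupId>
  <artifactId>commons-beanutils</artifactId>
  <version>1.0</version>
  <versioning>
    <versions>
      <version>1.0</version>
      <version>1.6.1</version>
      <version>1.7.0</version>
      <version>dev</version>
    </versions>
  </versioning>
</metadata>
"""


def describe_versions(xml_text):
    """Return (top_level_version, listed_versions) for a type 1 metadata file."""
    root = ET.fromstring(xml_text)
    # /metadata/version is a direct child of the root element.
    top = root.findtext("version")
    # /metadata/versioning/versions/version holds the available versions.
    listed = [v.text for v in root.findall("versioning/versions/version")]
    return top, listed


top, listed = describe_versions(EXAMPLE_1)
print("top-level version:", top)
print("listed versions:", listed)
# Assumption: treating the last listed entry as a crude proxy for "latest".
print("top-level matches last listed entry:", bool(listed) and top == listed[-1])
```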


.\ Metadata Type 2: [ groupId:artifactId:version ] \.

This type is version specific.

So far, most released artifacts have this in their directory, but it is 
not terribly useful IMO.

Example 2: 
http://repo1.maven.org/maven2/commons-beanutils/commons-beanutils/1.6.1/maven-metadata.xml

<metadata>
  <groupId>commons-beanutils</groupId>
  <artifactId>commons-beanutils</artifactId>
  <version>1.6.1</version>
</metadata>

Not very exciting, quite easy actually.

But when we deal with snapshots, it becomes critical.

First we'll look at an artifact with Timestamped snapshots.
Example 3: 
http://snapshots.repository.codehaus.org/org/codehaus/xfire/xfire-core/1.2-SNAPSHOT/maven-metadata.xml

<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <groupId>org.codehaus.xfire</groupId>
  <artifactId>xfire-core</artifactId>
  <version>1.2-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <timestamp>20070612.101111</timestamp>
      <buildNumber>63</buildNumber>
    </snapshot>
    <lastUpdated>20070612101133</lastUpdated>
  </versioning>
</metadata>

Next, here is an example without Timestamped snapshots.
Example 4: 
http://snapshots.repository.codehaus.org/org/codehaus/groovy/groovy/1.1-beta-2-SNAPSHOT/maven-metadata.xml

<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <groupId>org.codehaus.groovy</groupId>
  <artifactId>groovy</artifactId>
  <version>1.1-beta-2-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <buildNumber>2</buildNumber>
    </snapshot>
    <lastUpdated>20070616042726</lastUpdated>
  </versioning>
</metadata>
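
The difference between Examples 3 and 4 decides which file a client actually
asks for. A rough sketch in Python of how that could be worked out, assuming
the usual m2 naming where a timestamped snapshot file is named
artifactId-baseVersion-timestamp-buildNumber. The function name
snapshot_filename and the "jar" extension are assumptions for illustration,
and the samples are copies of Examples 3 and 4 without the XML declaration.

```python
import xml.etree.ElementTree as ET

# Copies of Example 3 (timestamped) and Example 4 (not timestamped).
TIMESTAMPED = """\
<metadata>
  <groupId>org.codehaus.xfire</groupId>
  <artifactId>xfire-core</artifactId>
  <version>1.2-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <timestamp>20070612.101111</timestamp>
      <buildNumber>63</buildNumber>
    </snapshot>
    <lastUpdated>20070612101133</lastUpdated>
  </versioning>
</metadata>
"""

NOT_TIMESTAMPED = """\
<metadata>
  <groupId>org.codehaus.groovy</groupId>
  <artifactId>groovy</artifactId>
  <version>1.1-beta-2-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <buildNumber>2</buildNumber>
    </snapshot>
    <lastUpdated>20070616042726</lastUpdated>
  </versioning>
</metadata>
"""


def snapshot_filename(xml_text, extension="jar"):
    """Guess the artifact file name described by a type 2 snapshot metadata file."""
    root = ET.fromstring(xml_text)
    artifact_id = root.findtext("artifactId")
    version = root.findtext("version")
    timestamp = root.findtext("versioning/snapshot/timestamp")
    build_number = root.findtext("versioning/snapshot/buildNumber")
    if timestamp and build_number and version.endswith("-SNAPSHOT"):
        # Timestamped: swap the SNAPSHOT suffix for timestamp-buildNumber.
        base = version[: -len("-SNAPSHOT")]
        return f"{artifact_id}-{base}-{timestamp}-{build_number}.{extension}"
    # Not timestamped: the file keeps the literal -SNAPSHOT version.
    return f"{artifact_id}-{version}.{extension}"


print(snapshot_filename(TIMESTAMPED))      # xfire-core-1.2-20070612.101111-63.jar
print(snapshot_filename(NOT_TIMESTAMPED))  # groovy-1.1-beta-2-SNAPSHOT.jar
```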

Next, here is an example of an artifact with both Timestamped and non 
Timestamped artifacts.  To see this you'll need to browse the directory: 
http://people.apache.org/repo/m2-snapshot-repository/org/apache/cocoon/cocoon-ajax/1-SNAPSHOT/

(Of course, this appears to be a case of someone uploading their local 
repository.)

Example 5: 
http://people.apache.org/repo/m2-snapshot-repository/org/apache/cocoon/cocoon-ajax/1-SNAPSHOT/maven-metadata.xml

<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <groupId>org.apache.cocoon</groupId>
  <artifactId>cocoon-ajax</artifactId>
  <version>1-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <timestamp>20060728.031822</timestamp>
      <buildNumber>10</buildNumber>
    </snapshot>
    <lastUpdated>20060728031823</lastUpdated>
  </versioning>
</metadata>


.\ Areas of Concern \.

Alright, I think we have a few areas of focus around this.

1) The repository consumer run to ensure that a metadata.xml file exists.
2) The database consumer run to ensure that the contents of the metadata.xml
   are sane based on the list of available versions in the database.
3) When proxying content from a remote repository for a released artifact,
   the type 1 groupId:artifactId needs to be updated to reflect the new
   artifactId:version that was just downloaded.
4) When proxying content from a remote repository for a snapshot artifact,
   the proxy mechanism needs to pull the remote metadata.xml to determine
   what actual artifactId:version to pull.  Is it timestamped or not?
5) When proxying content from a remote repository for a snapshot artifact,
   does the managed repository need to have the current metadata.xml from
   the remote repository?
6) When presenting to the user browsing the repository, do we show what is
   in the managed repository, or the full list of potential versions from
   all downstream remote repositories too?
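
For point 2, the sanity check could be as simple as comparing the versions
listed in metadata.xml against the versions the database knows about. The
sketch below is Python rather than anything Archiva-specific: compare_versions
is a hypothetical helper, and the known_versions list stands in for whatever
the database query would return.

```python
import xml.etree.ElementTree as ET

# Trimmed type 1 metadata, in the shape of Example 1.
METADATA = """\
<metadata>
  <groupId>commons-beanutils</groupId>
  <artifactId>commons-beanutils</artifactId>
  <versioning>
    <versions>
      <version>1.0</version>
      <version>1.6.1</version>
      <version>1.7.0</version>
      <version>dev</version>
    </versions>
  </versioning>
</metadata>
"""


def compare_versions(xml_text, known_versions):
    """Return (missing_from_metadata, unknown_to_database) as sorted lists."""
    root = ET.fromstring(xml_text)
    listed = {v.text for v in root.findall("versioning/versions/version")}
    known = set(known_versions)
    # Versions the database has but metadata.xml does not list.
    missing = sorted(known - listed)
    # Versions metadata.xml lists but the database has never seen.
    unknown = sorted(listed - known)
    return missing, unknown


# Hypothetical database contents, invented for this example.
known_versions = ["1.0", "1.6.1", "1.7.0", "1.8.0"]
missing, unknown = compare_versions(METADATA, known_versions)
print("missing from metadata.xml:", missing)
print("listed but not in database:", unknown)
```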


.\ Ideas \.

I think it would be a good idea to adopt what happens in the local
repository now, and have maven-metadata-${remote_repo_id}.xml files in the
managed repository that the proxy mechanism keeps up to date, and that a
separate merge mechanism uses to keep the managed repository metadata.xml
as accurate as possible with all potential versions available.
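
A rough sketch of what that merge step might look like, in Python for
brevity. Everything here is an assumption for discussion: the helper names
(listed_versions, merge_versions, build_metadata), the two sample inputs
standing in for maven-metadata-central.xml and maven-metadata-codehaus.xml,
and the choice to keep first-seen order instead of sorting by version.

```python
import xml.etree.ElementTree as ET

# Stand-ins for two per-remote files, e.g. maven-metadata-central.xml and
# maven-metadata-codehaus.xml (repo ids invented for this example).
FROM_CENTRAL = """\
<metadata>
  <groupId>commons-beanutils</groupId>
  <artifactId>commons-beanutils</artifactId>
  <versioning>
    <versions>
      <version>1.6</version>
      <version>1.6.1</version>
      <version>1.7.0</version>
    </versions>
  </versioning>
</metadata>
"""

FROM_CODEHAUS = """\
<metadata>
  <groupId>commons-beanutils</groupId>
  <artifactId>commons-beanutils</artifactId>
  <versioning>
    <versions>
      <version>1.7.0</version>
      <version>1.7-dev</version>
    </versions>
  </versioning>
</metadata>
"""


def listed_versions(xml_text):
    """Return the versions listed in one type 1 metadata document."""
    root = ET.fromstring(xml_text)
    return [v.text for v in root.findall("versioning/versions/version")]


def merge_versions(*sources):
    """Union of all listed versions, de-duplicated, in first-seen order."""
    merged, seen = [], set()
    for xml_text in sources:
        for version in listed_versions(xml_text):
            if version not in seen:
                seen.add(version)
                merged.append(version)
    return merged


def build_metadata(group_id, artifact_id, versions):
    """Serialize a merged type 1 metadata document as a string."""
    metadata = ET.Element("metadata")
    ET.SubElement(metadata, "groupId").text = group_id
    ET.SubElement(metadata, "artifactId").text = artifact_id
    versions_el = ET.SubElement(ET.SubElement(metadata, "versioning"), "versions")
    for version in versions:
        ET.SubElement(versions_el, "version").text = version
    return ET.tostring(metadata, encoding="unicode")


merged = merge_versions(FROM_CENTRAL, FROM_CODEHAUS)
print(merged)
print(build_metadata("commons-beanutils", "commons-beanutils", merged))
```

The sketch deliberately leaves out the top-level version element and
lastUpdated, since which value those should carry after a merge is part of
what is being asked here.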

WDYT?

--
- Joakim Erdfelt
  joakime@apache.org
  joakim@erdfelt.com
