archiva-dev mailing list archives

From Brett Porter
Subject [discuss] repository metadata
Date Mon, 28 Jul 2008 04:31:39 GMT

For some time (probably close to 2 years!) I've been thinking about a  
different way of storing metadata in the repository due to limitations  
in Maven's metadata (both the maven-metadata.xml files and POM files)  
- primarily a lack of extensibility (and thus the tie to Maven  
itself). In addition, I'd like to avoid the requirement to have a  
database for Archiva to work (and rather have it as a useful addition  
for easy searchability).

I think it'd be interesting to start being able to attach arbitrary  
metadata to artifacts. For example:
* Maven metadata as it does now
* OSGi information extracted from the JAR
* Ivy metadata so we could bridge those repos and vice-versa
* references to continuous build results
* references to historical coverage, test, etc results
* allowing users to add their own metadata types and attach them
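
As a rough sketch of what "arbitrary metadata attached to an artifact" might look like in code: each kind of metadata from the list above becomes a typed, timestamped facet. All names here (MetadataFacet, ArtifactMetadata, the facet type strings) are hypothetical and are not existing Archiva classes.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** One typed block of metadata attached to an artifact (hypothetical type). */
final class MetadataFacet {
    final String type;                    // e.g. "maven", "osgi", "ivy", "build-result"
    final Instant updated;                // when this facet was last written
    final Map<String, String> properties; // free-form key/value content

    MetadataFacet(String type, Instant updated, Map<String, String> properties) {
        this.type = type;
        this.updated = updated;
        // Defensive copy so later changes to the caller's map do not leak in.
        this.properties = Collections.unmodifiableMap(new LinkedHashMap<>(properties));
    }
}

/** An artifact plus whatever facets have been attached to it (hypothetical type). */
final class ArtifactMetadata {
    final String artifactId;
    private final Map<String, MetadataFacet> facets = new LinkedHashMap<>();

    ArtifactMetadata(String artifactId) {
        this.artifactId = artifactId;
    }

    /** Adds a facet, replacing any existing facet of the same type. */
    void attach(MetadataFacet facet) {
        facets.put(facet.type, facet);
    }

    Optional<MetadataFacet> facet(String type) {
        return Optional.ofNullable(facets.get(type));
    }

    Set<String> facetTypes() {
        return Collections.unmodifiableSet(facets.keySet());
    }
}
```

The point of keying facets by a plain string type is that a user-defined metadata type needs no change to the core model; whether strings, classes, or XML namespaces are the right key is an open design question.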

About a year ago at DevZuz we had a lot of success with a prototype of  
a metadata repository (based on the now defunct Eclipse Kepler  
project) that could read in Maven repositories but also other  
repository types, and then store that in Kepler format and push it to  
other sources (we had a JPA store, for example). I'm not proposing to  
use any of that, but the idea worked out well.

So, I've been thinking about making some changes to Archiva along  
these lines.

I would see this metadata as becoming the "state" Archiva knows about a
repository (and it should be entirely self-contained). So the Lucene
indices, database, and others are purely alternate storage mechanisms
of the metadata for various applications of it.

Each element would be timestamped, making scanning operations simpler.
When you run a particular consumer it can just check whether the
metadata for it is there already, and add or update it as needed (so a
"full scan" would still be reasonably efficient).
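
A minimal sketch of that check, assuming the comparison is between the metadata timestamp and the artifact's last-modified time (the post does not say what the timestamp is compared against). MetadataConsumer, TimestampedStore and IncrementalScanner are hypothetical names, not the existing Archiva consumer API.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical consumer: produces one kind of metadata for an artifact. */
interface MetadataConsumer {
    String id();
    String process(String artifactPath);
}

/** Hypothetical store keyed by (artifact path, consumer id), one timestamp per entry. */
final class TimestampedStore {
    static final class Entry {
        final String value;
        final Instant updated;

        Entry(String value, Instant updated) {
            this.value = value;
            this.updated = updated;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    Entry get(String path, String consumerId) {
        return entries.get(path + "#" + consumerId);
    }

    void put(String path, String consumerId, Entry entry) {
        entries.put(path + "#" + consumerId, entry);
    }
}

final class IncrementalScanner {
    /**
     * Runs the consumer only if its metadata is missing or older than the artifact.
     * Returns true if the consumer ran, false if the existing metadata was kept.
     */
    static boolean scan(TimestampedStore store, MetadataConsumer consumer,
                        String path, Instant artifactModified, Instant now) {
        TimestampedStore.Entry existing = store.get(path, consumer.id());
        if (existing != null && !existing.updated.isBefore(artifactModified)) {
            return false; // metadata is at least as new as the artifact: skip
        }
        store.put(path, consumer.id(),
                new TimestampedStore.Entry(consumer.process(path), now));
        return true;
    }
}
```

With this shape a full scan still visits every artifact, but the expensive part (running the consumer) only happens where metadata is absent or stale.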

While the metadata can be stored inside the repository, it should be  
possible (and maybe preferable?) to store it completely separately.  
This also makes it possible for one metadata repository to represent
multiple source artifact repositories, possibly on a different server:
you could run an Archiva scan on that repository and push metadata
updates over JMS/REST to the database/Lucene-backed webapp.
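
To illustrate the separation, here is a small sketch in which a scanner publishes updates tagged with their source repository, and any number of receivers (an index, a database, a remote webapp) consume them. The types MetadataUpdate, MetadataSink and MetadataPublisher are hypothetical, and the transport (JMS, REST, or in-process) is deliberately left behind the interface rather than implemented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical update message; repositoryId names the source artifact repository. */
final class MetadataUpdate {
    final String repositoryId;
    final String artifactPath;
    final String facetType;
    final String payload;

    MetadataUpdate(String repositoryId, String artifactPath,
                   String facetType, String payload) {
        this.repositoryId = repositoryId;
        this.artifactPath = artifactPath;
        this.facetType = facetType;
        this.payload = payload;
    }
}

/** Hypothetical receiver; an implementation could wrap a JMS producer or a REST client. */
interface MetadataSink {
    void accept(MetadataUpdate update);
}

/** Fans each update out to every registered sink, in registration order. */
final class MetadataPublisher {
    private final List<MetadataSink> sinks = new ArrayList<>();

    void register(MetadataSink sink) {
        sinks.add(sink);
    }

    void publish(MetadataUpdate update) {
        for (MetadataSink sink : sinks) {
            sink.accept(update);
        }
    }
}
```

Because every update carries its repositoryId, a single metadata repository can hold entries for several source repositories without the sinks needing to know where the scan ran.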

I think it would be valuable to use a format that is easily bridged to
web services to avoid too much rework.

That's just some of my thoughts for now. What do others think?

James, how would this fit with your new repository API?


Brett Porter
