archiva-dev mailing list archives

From Joakim Erdfelt <joa...@erdfelt.com>
Subject Re: WIP/POC archiva-jarinfo
Date Tue, 11 Mar 2008 04:43:59 GMT
Whoops.

Forgot to mention that you can find this code at ...
https://svn.apache.org/repos/asf/maven/sandbox/trunk/archiva/archiva-jarinfo

- Joakim

Joakim Erdfelt wrote:
> I've been working on and off on a few different Archiva-related tools 
> / tasks / libs.
>
> Brett and Wendy convinced me to upload what I have and outline what 
> I have in mind to let the creative juices flow. (Besides, I'm 
> running out of time to commit to Archiva, so this work will be slow to 
> progress if I do it alone.)
>
> Concept: archiva-jarinfo.
>
> A library for jar indexing / searching / identification for local 
> repositories, arbitrary directories of jars, and even remote 
> repositories.
>
> For use by ...
>
>  * Archiva itself as a possible replacement for repository scanning, 
> indexing, and searching.
>    (Searching on checksums, filenames, classnames, imports, 
> identification fields, and even public / exposed methods)
>  * Archiva RepoMan WebStart Tool - a tool I've been wanting to help 
> identify and upload content to an Archiva repository.
>  * Archiva Maven Plugin - imagine typing $ mvn archiva:search 
> -Dquery=Logger and getting hits on
>    log4j, slf4j, commons-logging, plexus-logging, etc., drawn from 
> both the local repository and remote repositories.
>  * Q4E integration - adding the ability for Q4E to search the local 
> repository and remote repositories for dependencies.
>
> Some details.
>
> (Some of this exists and works, some of it does not. Remember, this is 
> a Work in Progress.)
>
> The existing repository scanning / indexing in Archiva server makes 
> some assumptions that have proven to be misguided (such as only 
> searching for new content based on timestamp).  That timestamp check 
> exists to avoid the time-consuming part of the scan, the processing 
> of the jar file.  The approach archiva-jarinfo takes is to make that 
> processing cheap to skip instead.
> This is done by checking for a new xml file with the contents of the 
> jar file (called ${artifact}-${version}.jarinfo).  If the file exists, 
> it's up to date; if it doesn't exist, the jar details are collected 
> and the jarinfo file is created.
> I've found this useful if you sync or copy repository directories too, 
> as the jarinfo files come along for the ride and spare Archiva from 
> having to determine the jar details yet again.
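
The check described above can be sketched roughly as follows, in Java. This is only an illustration under assumptions: the class and method names (JarInfoScanSketch, ensureJarInfo) and the XML layout are hypothetical and are not taken from the archiva-jarinfo code, and only entry names are recorded here rather than the full jar details.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical sketch: not the actual archiva-jarinfo implementation.
final class JarInfoScanSketch {

    /** Maps e.g. foo-1.0.jar to its sibling foo-1.0.jarinfo. */
    static Path jarInfoPathFor(Path jar) {
        String name = jar.getFileName().toString();
        String base = name.endsWith(".jar") ? name.substring(0, name.length() - 4) : name;
        return jar.resolveSibling(base + ".jarinfo");
    }

    /**
     * Returns true if the jar had to be processed, false if an existing
     * jarinfo file was found and the expensive step was skipped.
     */
    static boolean ensureJarInfo(Path jar) throws IOException {
        Path info = jarInfoPathFor(jar);
        if (Files.exists(info)) {
            return false; // jarinfo present: treat the jar as already scanned
        }
        // Assumed XML layout: one <entry> element per file in the jar.
        List<String> lines = new ArrayList<>();
        lines.add("<jarinfo>");
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    lines.add("  <entry>" + escape(entry.getName()) + "</entry>");
                }
            }
        }
        lines.add("</jarinfo>");
        Files.write(info, lines, StandardCharsets.UTF_8);
        return true;
    }

    /** Minimal XML escaping for element text. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Calling ensureJarInfo on every jar found during a scan means the second and later scans only pay for a file-existence check per jar.
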
> The scan creates a Jar Info Bundle (*.jib file) that is just a jar 
> file with all of the *.jarinfo xml files in it, which remote JarInfo 
> clients can consume for indexing purposes.
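
Since a bundle is described as a plain jar of *.jarinfo files, building one could look something like the Java sketch below. The names (JarInfoBundleSketch, writeBundle) and the choice to store entries under their repository-relative paths are assumptions for illustration; the message does not say how entries are laid out inside a real *.jib file.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: not the actual archiva-jarinfo implementation.
final class JarInfoBundleSketch {

    /**
     * Collects every *.jarinfo file under repoRoot into a jar at jibFile.
     * Returns the number of jarinfo files bundled.
     */
    static int writeBundle(Path repoRoot, Path jibFile) throws IOException {
        List<Path> infos;
        try (Stream<Path> walk = Files.walk(repoRoot)) {
            infos = walk.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".jarinfo"))
                        .sorted()
                        .collect(Collectors.toList());
        }
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jibFile))) {
            for (Path info : infos) {
                // Assumption: entry names are paths relative to the repository root,
                // with '/' separators as the jar format expects.
                String entryName = repoRoot.relativize(info).toString()
                                           .replace(File.separatorChar, '/');
                out.putNextEntry(new JarEntry(entryName));
                Files.copy(info, out);
                out.closeEntry();
            }
        }
        return infos.size();
    }
}
```
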
>
> The JarInfo client uses the JarInfo lib to create an index for 
> checksums, jar content filenames, and public/exposed bytecode 
> information.
>
> The JarInfo client can search local repos, remote repos, and even 
> arbitrary directories of jar files.
>
> The JarInfo client can take an anonymous Jar file and perform a series 
> of identification checks in an attempt to identify the Jar file based 
> on jar file contents, and even similarity to jar files found in the 
> JarInfo indexes.
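
The simplest of the identification checks mentioned above, an exact checksum match against an index, might be sketched like this in Java. Everything named here (ChecksumIndexSketch, add, identify, the string identifiers) is hypothetical, and the choice of SHA-1 is an assumption; similarity matching on jar contents is not attempted in this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: not the actual archiva-jarinfo implementation.
final class ChecksumIndexSketch {

    // Maps a SHA-1 hex digest to an identifier for a known jar.
    private final Map<String, String> bySha1 = new HashMap<>();

    /** Streams the file through SHA-1 and returns the lowercase hex digest. */
    static String sha1Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Registers a known jar under the given identifier. */
    void add(String id, Path jar) throws IOException {
        bySha1.put(sha1Hex(jar), id);
    }

    /** Looks up an anonymous jar by checksum; empty if no exact match. */
    Optional<String> identify(Path unknownJar) throws IOException {
        return Optional.ofNullable(bySha1.get(sha1Hex(unknownJar)));
    }
}
```

An exact match only identifies byte-for-byte copies; a repackaged or re-signed jar would fall through to the content-based checks.
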
>
> That's all the info I can squeeze out tonight. Hopefully someone else 
> will find this useful.
>
> Thanks,
> - Joakim
>

