archiva-dev mailing list archives

From Joakim Erdfelt <joak...@apache.org>
Subject Re: Location of index files in 1.0?
Date Mon, 11 Jun 2007 16:50:39 GMT
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Wendy Smoak wrote:
<blockquote
 cite="mid:adba96190706110626y6679d24bhf94efeaf26f19474@mail.gmail.com"
 type="cite">Does Archiva still use Lucene for indexing?&nbsp; I can't find
the
  <br>
configuration for the index directory in a reasonably recent version.
  <br>
It used to be &lt;indexPath&gt; in archiva.xml.
  <br>
  <br>
If so, where is the index stored now? (And how do I move it?)
  <br>
  <br>
Thanks,
  <br>
</blockquote>
<br>
Archiva 1.0 and index file storage.<br>
<br>
<u>How it works:</u><br>
<br>
There are 2 modules in play.<br>
<tt>archiva-indexer<br>
archiva-configuration</tt><br>
<br>
In <tt>archiva-indexer</tt> there is an interface called...<br>
&nbsp; <a
 href="http://maven.apache.org/archiva/1.0/apidocs/org/apache/maven/archiva/indexer/RepositoryContentIndexFactory.html"><tt>org.apache.maven.archiva.indexer.RepositoryContentIndexFactory</tt></a><br>
With a default implementation ...<br>
&nbsp; <a
 href="http://maven.apache.org/archiva/1.0/apidocs/org/apache/maven/archiva/indexer/lucene/LuceneRepositoryContentIndexFactory.html"><tt>org.apache.maven.archiva.indexer.lucene.LuceneRepositoryContentIndexFactory</tt></a><tt>&nbsp;
<a
 href="http://maven.apache.org/archiva/1.0/xref/org/apache/maven/archiva/indexer/lucene/LuceneRepositoryContentIndexFactory.html">(xref)</a></tt><br>
which is responsible for setting up the various indexes.<br>
<br>
The index is per-repository. This is intentional: it allows security to
be applied around each repository, so that a search can be allowed or
denied based on roles, etc.<br>
<br>
The index directory is calculated using the following chunk of code
(found in <a
 href="http://maven.apache.org/archiva/1.0/xref/org/apache/maven/archiva/indexer/lucene/LuceneRepositoryContentIndexFactory.html#69"><tt>LuceneRepositoryContentIndexFactory</tt></a>)<br>
<br>
<pre>    <font color="#0000ff">/**</font>
<font color="#0000ff">     *</font><font color="#6a5acd"> Obtain the index directory for the provided repository.</font>
<font color="#0000ff">     * </font>
<font color="#0000ff">     * </font><font color="#6a5acd">@param</font><font color="#008b8b"> repository</font><font color="#0000ff"> the repository to obtain the index directory from.</font>
<font color="#0000ff">     * </font><font color="#6a5acd">@param</font><font color="#008b8b"> indexId</font><font color="#0000ff"> the id of the index</font>
<font color="#0000ff">     * </font><font color="#6a5acd">@return</font><font color="#0000ff"> the directory to put the index into.</font>
<font color="#0000ff">     */</font>
    <font color="#2e8b57"><b>private</b></font> File toIndexDir( ArchivaRepository repository, String indexId )
    {
        <font color="#a52a2a"><b>if</b></font> ( !repository.isManaged() )
        {
            <font color="#a52a2a"><b>throw</b></font> <font
 color="#a52a2a"><b>new</b></font> IllegalArgumentException( <font
 color="#ff00ff">"Only supports managed repositories."</font> );
        }

        <font color="#0000ff">// Attempt to get the specified indexDir in the configuration first.</font>
        RepositoryConfiguration repoConfig = configuration.getConfiguration().findRepositoryById(
repository.getId() );
        File indexDir;

        <font color="#a52a2a"><b>if</b></font> ( repoConfig == <font
 color="#ff00ff">null</font> )
        {
            <font color="#0000ff">// No configured index dir, use the repository path instead.</font>
            String repoPath = repository.getUrl().getPath();
            indexDir = <font color="#a52a2a"><b>new</b></font> File(
repoPath, <font
 color="#ff00ff">".index/"</font> + indexId + <font color="#ff00ff">"/"</font>
);
        }
        <font color="#a52a2a"><b>else</b></font>
        {
            <font color="#0000ff">// Use configured index dir.</font>
            String repoPath = repoConfig.getIndexDir();
            <font color="#a52a2a"><b>if</b></font> ( StringUtils.isBlank(
repoPath ) )
            {
                repoPath = repository.getUrl().getPath();
                <font color="#a52a2a"><b>if</b></font> ( !repoPath.endsWith(
<font
 color="#ff00ff">"/"</font> ) )
                {
                    repoPath += <font color="#ff00ff">"/"</font>;
                }
                repoPath += <font color="#ff00ff">".index"</font>;
            }
            indexDir = <font color="#a52a2a"><b>new</b></font> File(
repoPath, <font
 color="#ff00ff">"/"</font> + indexId + <font color="#ff00ff">"/"</font>
);
        }

        <font color="#a52a2a"><b>return</b></font> indexDir;
    }</pre>
This means that if <tt>indexDir</tt> is specified in the configuration for
this repository, it is used as the 'topmost' directory for the
indexes. Otherwise the indexes go into a <tt>.index</tt> directory under
the repository path.<br>
<br>
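A minimal standalone sketch of the same rule, for experimenting outside Archiva. Plain strings stand in for <tt>ArchivaRepository</tt> and <tt>RepositoryConfiguration</tt>; the class name <tt>IndexDirSketch</tt> is hypothetical, and the index ids used in <tt>main</tt> are assumed to match the per-index directory names (bytecode, filecontent, hashcodes). It is not the Archiva implementation, only an approximation of its path logic.<br>
<br>

```java
import java.io.File;

public class IndexDirSketch {
    /**
     * Simplified stand-in for toIndexDir(): repoPath replaces
     * repository.getUrl().getPath(), and configuredIndexDir replaces
     * repoConfig.getIndexDir() (null or blank when not configured).
     */
    public static File toIndexDir(String repoPath, String configuredIndexDir, String indexId) {
        String base = configuredIndexDir;
        if (base == null || base.trim().isEmpty()) {
            // No indexDir configured: fall back to ".index" under the repository path.
            base = repoPath.endsWith("/") ? repoPath : repoPath + "/";
            base += ".index";
        }
        // Each index gets its own subdirectory under the base directory.
        return new File(base, indexId);
    }

    public static void main(String[] args) {
        // Assumed index ids, one per index type.
        String[] indexIds = { "bytecode", "filecontent", "hashcodes" };
        for (String id : indexIds) {
            // Repository without indexDir: resolves under the repository's .index directory.
            System.out.println(toIndexDir("/home/joakim/java/archiva/snapshots/", null, id));
            // Repository with indexDir: resolves under the configured base directory.
            System.out.println(toIndexDir("/home/joakim/java/archiva/corporate/", "/opt/indexes/corporate/", id));
        }
    }
}
```

Running <tt>main</tt> prints one path per index id for each of the two cases, which makes it easy to see where a given setting would place the indexes before touching a live configuration.<br>
<br>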
There are 3 types of indexes.<br>
<ul>
  <li><a
 href="http://maven.apache.org/archiva/1.0/apidocs/org/apache/maven/archiva/indexer/bytecode/package-summary.html">Bytecode</a>
- holds the class names, public method signatures, and package names.</li>
  <li><a
 href="http://maven.apache.org/archiva/1.0/apidocs/org/apache/maven/archiva/indexer/filecontent/package-summary.html">FileContent</a>
- holds a raw file content index (for those files flagged as
'indexable' in the repository scan)</li>
  <li><a
 href="http://maven.apache.org/archiva/1.0/apidocs/org/apache/maven/archiva/indexer/hashcodes/package-summary.html">Hashcode</a>
- holds the file reference, as well as artifact reference, complete
with md5 and sha1 hashcodes.<br>
  </li>
</ul>
(Some $HOME/.m2/archiva.xml examples)<br>
Let's look at what happens with a simple repository definition.<br>
<pre>    <font color="#008b8b">&lt;</font><font color="#008b8b">repository</font><font
 color="#008b8b">&gt;</font>
      <font color="#008b8b">&lt;</font><font color="#008b8b">id</font><font
 color="#008b8b">&gt;</font>snapshots<font color="#008b8b">&lt;/id&gt;</font>
      <font color="#008b8b">&lt;</font><font color="#008b8b">name</font><font
 color="#008b8b">&gt;</font>Managed Snapshots Repository<font
 color="#008b8b">&lt;/name&gt;</font>
      <font color="#008b8b">&lt;</font><font color="#008b8b">url</font><font
 color="#008b8b">&gt;</font><a class="moz-txt-link-freetext" href="file:/home/joakim/java/archiva/snapshots/">file:/home/joakim/java/archiva/snapshots/</a><font
 color="#008b8b">&lt;/url&gt;</font>
      <font color="#008b8b">&lt;</font><font color="#008b8b">snapshots</font><font
 color="#008b8b">&gt;</font>true<font color="#008b8b">&lt;/snapshots&gt;</font>
    <font color="#008b8b">&lt;/repository&gt;</font></pre>
This will result in the following directory structure ...<br>
<br>
<tt>[joakim@monolith .index]$ pwd<br>
/home/joakim/java/archiva/snapshots/.index<br>
[joakim@monolith .index]$ ls -la<br>
total 16<br>
drwxr-xr-x 4 joakim joakim 4096 2007-05-29 15:45 .<br>
drwxr-xr-x 6 joakim joakim 4096 2007-05-25 14:34 ..<br>
</tt><tt>drwxr-xr-x 2 joakim joakim 4096 2007-05-29 16:20 bytecode<br>
</tt><tt>drwxr-xr-x 2 joakim joakim 4096 2007-05-29 15:59 filecontent<br>
drwxr-xr-x 2 joakim joakim 4096 2007-05-29 16:10 hashcodes<br>
[joakim@monolith .index]$ </tt><br>
<br>
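To check the same layout from Java rather than the shell, something like the following could be used. It is only a sketch: the class name <tt>IndexLayoutCheck</tt> is hypothetical, the three directory names are taken from the listing above, and it only tests that the directories exist, not that they contain valid Lucene indexes.<br>
<br>

```java
import java.io.File;

public class IndexLayoutCheck {
    // Per-index directory names, assumed from the listing above.
    static final String[] INDEX_IDS = { "bytecode", "filecontent", "hashcodes" };

    /** Reports each expected index directory under baseDir and returns how many exist. */
    public static int countPresent(File baseDir) {
        int present = 0;
        for (String id : INDEX_IDS) {
            File dir = new File(baseDir, id);
            boolean found = dir.isDirectory();
            System.out.println((found ? "found    " : "missing  ") + dir.getPath());
            if (found) {
                present++;
            }
        }
        return present;
    }

    public static void main(String[] args) {
        // Defaults to the example location from this message; pass another base directory as args[0].
        String base = args.length != 0 ? args[0] : "/home/joakim/java/archiva/snapshots/.index";
        countPresent(new File(base));
    }
}
```
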
Note that this is the default directory that is created for the Lucene
indexes.<br>
It can be changed with the following setting. At the moment it can only
be set in the archiva.xml file; oddly, no GUI for this setting exists yet.&nbsp; Not
sure how that slipped through.<br>
<pre>    <font color="#008b8b">&lt;</font><font color="#008b8b">repository</font><font
 color="#008b8b">&gt;</font>
      <font color="#008b8b">&lt;</font><font color="#008b8b">id</font><font
 color="#008b8b">&gt;</font>corporate<font color="#008b8b">&lt;/id&gt;</font>
      <font color="#008b8b">&lt;</font><font color="#008b8b">name</font><font
 color="#008b8b">&gt;</font>Managed Corporate Repository<font
 color="#008b8b">&lt;/name&gt;</font>
      <font color="#008b8b">&lt;</font><font color="#008b8b">url</font><font
 color="#008b8b">&gt;</font><a class="moz-txt-link-freetext" href="file:/home/joakim/java/archiva/corporate/">file:/home/joakim/java/archiva/corporate/</a><font
 color="#008b8b">&lt;/url&gt;</font>
      <font color="#008b8b">&lt;</font><font color="#008b8b">indexDir</font><font
 color="#008b8b">&gt;</font>/opt/indexes/corporate/<font color="#008b8b">&lt;/indexDir&gt;</font>
    <font color="#008b8b">&lt;/repository&gt;</font></pre>
This sets the index directory for the corporate repository to <tt>/opt/indexes/corporate/</tt>.<br>
Remember, this is only a base directory for the indexes.&nbsp; Let Archiva
and Lucene manage the individual directories under this base directory.<br>
<br>
Hope this helps.<br>
<br>
Hmm.&nbsp; This should be put in the documentation. <span
 class="moz-smiley-s14"><span> O:-) </span></span><br>
(flags this as a todo, and continues work on <a
 href="http://jira.codehaus.org/browse/MRM-410">MRM-410</a>)<br>
<pre class="moz-signature" cols="72">-- 
- Joakim Erdfelt
  Committer and PMC Member, Apache Maven
  Archiva Developer
  <a class="moz-txt-link-abbreviated" href="mailto:joakim@apache.org">joakim@apache.org</a>
</pre>
</body>
</html>
