lucene-dev mailing list archives

From carl...@apache.org
Subject cvs commit: jakarta-lucene/docs demo2.html
Date Sat, 26 Jan 2002 16:38:13 GMT
carlson     02/01/26 08:38:13

  Added:       docs     demo2.html
  Log:
  Getting Started tutorial added by Andrew C. Oliver.
  
  Revision  Changes    Path
  1.1                  jakarta-lucene/docs/demo2.html
  
  Index: demo2.html
  ===================================================================
  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
  
  <!-- Content Stylesheet for Site -->
  
          
  <!-- start the processing -->
      <!-- ====================================================================== -->
      <!-- Main Page Section -->
      <!-- ====================================================================== -->
      <html>
          <head>
              <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
  
              <meta name="author" value="Andrew C. Oliver">
              <meta name="email" value="acoliver@apache.org">
              
              <title>Jakarta Lucene - Jakarta Lucene - Basic Demo Sources Walkthrough</title>
          </head>
  
          <body bgcolor="#ffffff" text="#000000" link="#525D76">        
              <table border="0" width="100%" cellspacing="0">
                  <!-- TOP IMAGE -->
                  <tr>
                      <td align="left">
  <a href="http://jakarta.apache.org"><img src="http://jakarta.apache.org/images/jakarta-logo.gif"
border="0"/></a>
  </td>
  <td align="right">
  <a href="http://jakarta.apache.org/lucene/"><img src="./images/lucene_green_300.gif"
alt="Jakarta Lucene" border="0"/></a>
  </td>
                  </tr>
              </table>
              <table border="0" width="100%" cellspacing="4">
                  <tr><td colspan="2">
                      <hr noshade="" size="1"/>
                  </td></tr>
                  
                  <tr>
                      <!-- LEFT SIDE NAVIGATION -->
                      <td width="20%" valign="top" nowrap="true">
                                  <p><strong>About</strong></p>
          <ul>
                      <li>    <a href="./index.html">Overview</a>
  </li>
                      <li>    <a href="./powered.html">Powered by Lucene</a>
  </li>
                      <li>    <a href="./whoweare.html">Who We Are</a>
  </li>
                      <li>    <a href="http://jakarta.apache.org/site/mail.html">Mailing
Lists</a>
  </li>
                  </ul>
              <p><strong>Resources</strong></p>
          <ul>
                      <li>    <a href="http://www.lucene.com/cgi-bin/faq/faqmanager.cgi">FAQ
(Official)</a>
  </li>
                      <li>    <a href="./gettingstarted.html">Getting Started</a>
  </li>
                      <li>    <a href="http://www.jguru.com/faq/Lucene">JGuru
FAQ</a>
  </li>
                      <li>    <a href="http://jakarta.apache.org/site/bugs.html">Bugs</a>
  </li>
                      <li>    <a href="http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&product=Lucene&short_desc=&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=%27Importance%27">Lucene
Bugs</a>
  </li>
                      <li>    <a href="./resources.html">Articles</a>
  </li>
                      <li>    <a href="./api/index.html">Javadoc</a>
  </li>
                      <li>    <a href="./contributions.html">Contributions</a>
  </li>
                  </ul>
              <p><strong>Download</strong></p>
          <ul>
                      <li>    <a href="http://jakarta.apache.org/site/binindex.html">Binaries</a>
  </li>
                      <li>    <a href="http://jakarta.apache.org/site/sourceindex.html">Source
Code</a>
  </li>
                      <li>    <a href="http://jakarta.apache.org/site/cvsindex.html">CVS
Repositories</a>
  </li>
                  </ul>
              <p><strong>Jakarta</strong></p>
          <ul>
                      <li>    <a href="http://jakarta.apache.org/site/getinvolved.html">Get
Involved</a>
  </li>
                      <li>    <a href="http://jakarta.apache.org/site/acknowledgements.html">Acknowledgements</a>
  </li>
                      <li>    <a href="http://jakarta.apache.org/site/contact.html">Contact</a>
  </li>
                      <li>    <a href="http://jakarta.apache.org/site/legal.html">Legal</a>
  </li>
                  </ul>
                          </td>
                      <td width="80%" align="left" valign="top">
                                                                      <table border="0"
cellspacing="0" cellpadding="2" width="100%">
        <tr><td bgcolor="#525D76">
          <font color="#ffffff" face="arial,helvetica,sanserif">
            <a name="About the Code"><strong>About the Code</strong></a>
          </font>
        </td></tr>
        <tr><td>
          <blockquote>
                                      <p>
  In this section we walk through the sources behind the basic Lucene demo: where to find
  them, what their parts are, and what each part does.  This section is intended for Java
  developers wishing to understand how to use Jakarta Lucene in their applications.
  </p>
                              </blockquote>
          </p>
        </td></tr>
        <tr><td><br/></td></tr>
      </table>
                                                  <table border="0" cellspacing="0" cellpadding="2"
width="100%">
        <tr><td bgcolor="#525D76">
          <font color="#ffffff" face="arial,helvetica,sanserif">
            <a name="Location of the source"><strong>Location of the source</strong></a>
          </font>
        </td></tr>
        <tr><td>
          <blockquote>
                                      <p>
  Relative to the directory created when you extracted Lucene or retrieved it from CVS, you
  should see a directory called "src", which in turn contains a directory called "demo".
  This is the root for all of the Lucene demos.  Under this directory is org/apache/lucene/demo,
  which is where all the Java sources live.
  </p>
                                                  <p>
  Within this directory you should see the IndexFiles class we executed earlier.  Bring that
  up in vi or your preferred text editor and let's take a look at it.
  </p>
                              </blockquote>
          </p>
        </td></tr>
        <tr><td><br/></td></tr>
      </table>
                                                  <table border="0" cellspacing="0" cellpadding="2"
width="100%">
        <tr><td bgcolor="#525D76">
          <font color="#ffffff" face="arial,helvetica,sanserif">
            <a name="IndexFiles"><strong>IndexFiles</strong></a>
          </font>
        </td></tr>
        <tr><td>
          <blockquote>
                                      <p>
  As we discussed in the previous walkthrough, the IndexFiles class creates a Lucene index.
  Let's take a look at how it does this.
  </p>
                                                  <p>
  The first substantial thing the main function does is create an instance
  of IndexWriter.  It passes a string called "index" and a new instance of a class called
  "StandardAnalyzer".  The "index" string is the name of the directory in which all index
  information should be stored.  Because we're not passing any path information, one must
  assume this will be created as a subdirectory of the current directory (if it does not
  already exist).  On some platforms this may actually result in it being created in other
  directories (such as the user's home directory).
  </p>
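                                                  <p>
  To see where a bare name such as "index" would land on your platform, a minimal sketch
  along the following lines may help.  It uses only java.io.File; the class name
  IndexPathSketch and its resolve() helper are hypothetical and are not part of the demo
  sources.  It relies on the fact that java.io.File resolves a relative name against the
  JVM's working directory (the "user.dir" system property), which is not always the
  directory you expect.
  </p>

```java
import java.io.File;

// Hypothetical helper class, not part of Lucene or the demo sources.
class IndexPathSketch {
    // Resolve a directory name the way java.io.File does: a relative name
    // is resolved against the JVM's working directory ("user.dir").
    static File resolve(String name) {
        return new File(name).getAbsoluteFile();
    }

    public static void main(String[] args) {
        System.out.println("user.dir = " + System.getProperty("user.dir"));
        System.out.println("\"index\" resolves to " + resolve("index"));
    }
}
```

                                                  <p>
  Running this from the directory where you plan to start IndexFiles shows which directory
  a relative "index" name would refer to there.
  </p>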
                                                  <p>
  The <b>IndexWriter</b> is the main class responsible for creating indices.  To use it you
  must instantiate it with a path that it can write the index into.  If this path does not
  exist it will create it; otherwise it will refresh the index living at that path.  You
  must also pass an instance of <b>org.apache.lucene.analysis.Analyzer</b>.
  </p>
                                                  <p>
  The <b>Analyzer</b>, in this case the <b>StandardAnalyzer</b>, is little more than a
  standard Java tokenizer that converts all strings to lowercase and filters useless words
  out of the index.  By useless words I mean common language words such as articles
  (a, an, the) and other words that would be useless for searching.  It should be noted that
  there are different rules for every language, and you should use the proper analyzer for
  each.  Lucene currently provides Analyzers for English and German.
  </p>
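                                                  <p>
  The idea of lowercasing and stop-word filtering can be sketched in plain Java.  The
  following is a toy illustration only, not Lucene's analyzer: the class StopWordSketch,
  its tiny stop list and its splitting rule are all assumptions made for this example.
  </p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy illustration only; this is not Lucene's StandardAnalyzer.
class StopWordSketch {
    // A deliberately tiny English stop list chosen for this example.
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "the", "and", "of"));

    // Split on anything that is not a letter or digit, lowercase each
    // token, and drop tokens that appear in the stop list.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The quick brown Fox and a dog"));
    }
}
```

                                                  <p>
  A different language would need a different stop list and possibly different splitting
  rules, which is why the analyzer is passed in rather than fixed.
  </p>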
                                                  <p>
  Looking further down in the file, you should see the indexDocs() code.  This recursive
  function simply crawls the directories and uses FileDocument to create Document objects.
  The Document is simply a data object that represents the content of the file as well as
  its creation time and location.  These instances are added to the IndexWriter.  Take a
  look inside FileDocument.  It's not particularly complicated; it just adds fields to the
  Document.
  </p>
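                                                  <p>
  The shape of such a recursive crawl can be sketched without any Lucene classes.  In the
  sketch below, CrawlSketch and FileRecord are hypothetical names: FileRecord stands in for
  the Document, and it keeps the file's path and last-modified time because that is the
  timestamp java.io.File exposes.  The real indexDocs() and FileDocument may differ in
  detail.
  </p>

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for the demo's indexDocs(); no Lucene classes are used.
class CrawlSketch {
    // Hypothetical data object playing the role the text assigns to Document.
    static class FileRecord {
        final String path;
        final long modified;

        FileRecord(String path, long modified) {
            this.path = path;
            this.modified = modified;
        }
    }

    // Recurse into directories; turn every regular file into a record.
    static void crawl(File file, List<FileRecord> out) {
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children == null) {
                return; // directory could not be read
            }
            for (File child : children) {
                crawl(child, out);
            }
        } else if (file.isFile()) {
            out.add(new FileRecord(file.getPath(), file.lastModified()));
        }
    }

    public static void main(String[] args) {
        List<FileRecord> records = new ArrayList<>();
        crawl(new File(args.length > 0 ? args[0] : "."), records);
        System.out.println(records.size() + " files found");
    }
}
```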
                                                  <p>
  As you can see there isn't much to creating an index.  The devil is in the details.  You
  may also wish to examine the other samples in this directory, particularly the IndexHTML
  class.  It is a bit more complex but builds upon this example.
  </p>
                              </blockquote>
          </p>
        </td></tr>
        <tr><td><br/></td></tr>
      </table>
                                                  <table border="0" cellspacing="0" cellpadding="2"
width="100%">
        <tr><td bgcolor="#525D76">
          <font color="#ffffff" face="arial,helvetica,sanserif">
            <a name="Searching Files"><strong>Searching Files</strong></a>
          </font>
        </td></tr>
        <tr><td>
          <blockquote>
                                      <p>
  The SearchFiles class is quite simple.  It primarily collaborates with an IndexSearcher,
  a StandardAnalyzer (which is used in the IndexFiles class as well) and a QueryParser.  The
  query parser is constructed with an analyzer so that it interprets your query in the same
  way the index was interpreted: finding the ends of words and removing useless words like
  'a', 'an' and 'the'.  The Query object contains the results from the QueryParser and is
  passed to the searcher.  The searcher returns its results in a collection of Documents
  called "Hits", which is then iterated through and displayed to the user.
  </p>
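                                                  <p>
  Why the query must be analyzed with the same rules as the index can be illustrated with
  a toy in-memory index.  The sketch below is not Lucene's API: SearchFlowSketch, its
  add() and search() methods, and the rule that a match must contain every query term are
  assumptions chosen to keep the example short.
  </p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy in-memory index; it only mimics the flow described above, not Lucene's API.
class SearchFlowSketch {
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "the"));

    // term -> names of the documents containing it
    private final Map<String, Set<String>> postings = new LinkedHashMap<>();

    // Shared analysis step, used for both indexing and querying.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            String term = raw.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    // Index a document under every term its contents analyze to.
    void add(String name, String contents) {
        for (String term : analyze(contents)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(name);
        }
    }

    // Return the documents that contain every analyzed query term.
    Set<String> search(String query) {
        Set<String> hits = null;
        for (String term : analyze(query)) {
            Set<String> docs = postings.getOrDefault(term, new TreeSet<>());
            if (hits == null) {
                hits = new TreeSet<>(docs);
            } else {
                hits.retainAll(docs);
            }
        }
        return hits == null ? new TreeSet<>() : hits;
    }
}
```

                                                  <p>
  Because both add() and search() go through analyze(), a query for "the FOX" finds a
  document that was indexed with the text "The Quick Brown Fox"; if the query skipped the
  lowercasing step it would look up "FOX" and find nothing.
  </p>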
                              </blockquote>
          </p>
        </td></tr>
        <tr><td><br/></td></tr>
      </table>
                                                  <table border="0" cellspacing="0" cellpadding="2"
width="100%">
        <tr><td bgcolor="#525D76">
          <font color="#ffffff" face="arial,helvetica,sanserif">
            <a name="The Web example..."><strong>The Web example...</strong></a>
          </font>
        </td></tr>
        <tr><td>
          <blockquote>
                                      <p>
  <a href="demo3.html">read on&gt;&gt;&gt;</a>
  </p>
                              </blockquote>
          </p>
        </td></tr>
        <tr><td><br/></td></tr>
      </table>
                                          </td>
                  </tr>
  
                  <!-- FOOTER -->
                  <tr><td colspan="2">
                      <hr noshade="" size="1"/>
                  </td></tr>
                  <tr><td colspan="2">
                      <div align="center"><font color="#525D76" size="-1"><em>
                      Copyright &#169; 1999-2002, Apache Software Foundation
                      </em></font></div>
                  </td></tr>
              </table>
          </body>
      </html>
  <!-- end the processing -->