lucene-general mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: How to create Index ?
Date Tue, 20 Sep 2005 21:25:28 GMT

Understanding the basics of how to compile and execute Java programs is a
little outside the scope of the Lucene mailing list(s).  You should start
by looking at some tutorials for using Java on Windows, in particular on
how to include jar files in your classpath (both when compiling and
running Java applications).  A quick skim of Google results for "java
classpath tutorial" turned up this result, which may be helpful...

  http://www.kevinboone.com/classpath.html
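
As a rough illustration of the classpath point above: on Windows the jar is
normally passed with the -classpath option to both javac and java, with
entries separated by ";". The commands below are only a sketch. The jar name
lucene-1.4.3.jar and the index directory C:\index are assumptions; C:\lucene
and C:\text come from the quoted messages.

  javac -classpath C:\lucene\lucene-1.4.3.jar Indexer.java
  java -classpath C:\lucene\lucene-1.4.3.jar;. Indexer C:\index C:\text

If it is unclear whether the jar is actually being picked up, a small
diagnostic along these lines may help. ClasspathCheck is a hypothetical
helper written for this note, not part of Lucene. It prints the classpath
entries the JVM sees and reports whether a given class can be located.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): shows what the JVM sees on its classpath.
class ClasspathCheck {

    // Returns true if the named class can be located by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Splits a classpath string on the platform separator (";" on Windows, ":" on Unix).
    static String[] entries(String classpath) {
        if (classpath == null || classpath.isEmpty()) {
            return new String[0];
        }
        return classpath.split(Pattern.quote(File.pathSeparator));
    }

    public static void main(String[] args) {
        for (String entry : entries(System.getProperty("java.class.path"))) {
            System.out.println("classpath entry: " + entry);
        }
        // Default to the class the Indexer program needs; any class name can be passed instead.
        String name = args.length > 0 ? args[0] : "org.apache.lucene.index.IndexWriter";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found"));
    }
}
```

Running it with the same -classpath value used for Indexer should report
org.apache.lucene.index.IndexWriter as found. If it does not, the jar path in
the classpath is probably wrong.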



: Date: Tue, 20 Sep 2005 14:14:36 -0700 (PDT)
: From: Arpit Sharma <pathfinder_of_india@yahoo.com>
: Reply-To: general@lucene.apache.org
: To: general@lucene.apache.org
: Subject: Re: How to create Index ?
:
: Thanks Erik, but things are still not working. The source code
: I downloaded does have a README file, but it says "The JAR files
: in the lib directory need to be in your build and execution
: classpath to run manually." and I think I am not able to do that.
: Can you please tell me step by step how to do this? I am really
: sorry, but I am very new to all this.
:
: I have unpacked the lucene1.4.3.jar file, and its folder is
: C:\org\apache\lucene. What should I do next? Please also tell me
: how to add classpaths.
:
: Thanks a lot
:
: --- Erik Hatcher <erik@ehatchersolutions.com> wrote:
:
: > Arpit,
: >
: > It looks like you've omitted the import statements from
: > Indexer.java.  The book omits import statements to conserve
: > space, but they are important.  The code is provided in its
: > entirety at http://www.lucenebook.com
: >
: > In fact, you could build an index by running the code directly
: > (read the README file and follow the instructions first) by
: > typing "ant Indexer" and following the prompts.  One of the
: > prompts asks you where to put the index itself, and the next
: > prompt asks for the directory of text files to index.
: >
: >      Erik
: >
: >
: >
: > On Sep 19, 2005, at 10:34 PM, Arpit Sharma wrote:
: >
: > > I have put the .jar file in C:\lucene. I have also unzipped it
: > > and put all the directories (like analysis, index, store) in
: > > C:\lucene.
: > >
: > > Now, how do I create an index?
: > > All the text files are in the C:\text directory. I have the
: > > "Lucene in Action" book, and with its help I wrote an
: > > Indexer.java program in C:\lucene. When I tried to compile it,
: > > it gave lots of errors.
: > > The code is fine (it is copied and pasted from the book).
: > >
: > > I am sure that there is some path problem. What should I do?
: > >
: > > Thanks
: > >
: > > Here is the code of the Indexer.java:-
: > > ----------------
: > >
: > > /**
: > >  * This code was originally written for
: > >  * Erik's Lucene intro java.net article
: > >  */
: > >
: > > public class Indexer {
: > >
: > >     public static void main(String[] args) throws Exception {
: > >         if (args.length != 2) {
: > >             throw new Exception("Usage: java " + Indexer.class.getName()
: > >                 + " <index dir> <data dir>");
: > >         }
: > >
: > >         File indexDir = new File(args[0]);
: > >         File dataDir = new File(args[1]);
: > >
: > >         long start = new Date().getTime();
: > >         int numIndexed = index(indexDir, dataDir);
: > >         long end = new Date().getTime();
: > >
: > >         System.out.println("Indexing " + numIndexed + " files took "
: > >             + (end - start) + " milliseconds");
: > >     }
: > >
: > >     // open an index and start file directory traversal
: > >     public static int index(File indexDir, File dataDir)
: > >         throws IOException {
: > >
: > >         if (!dataDir.exists() || !dataDir.isDirectory()) {
: > >             throw new IOException(dataDir
: > >                 + " does not exist or is not a directory");
: > >         }
: > >
: > >         IndexWriter writer = new IndexWriter(indexDir,
: > >             new StandardAnalyzer(), true);
: > >         writer.setUseCompoundFile(false);
: > >
: > >         indexDirectory(writer, dataDir);
: > >
: > >         int numIndexed = writer.docCount();
: > >
: > >         writer.optimize();
: > >         writer.close();
: > >
: > >         return numIndexed;
: > >     }
: > >
: > >     // recursive method that calls itself when it finds a directory
: > >     private static void indexDirectory(IndexWriter writer, File dir)
: > >         throws IOException {
: > >
: > >         File[] files = dir.listFiles();
: > >         for (int i = 0; i < files.length; i++) {
: > >             File f = files[i];
: > >             if (f.isDirectory()) {
: > >                 indexDirectory(writer, f);
: > >             } else if (f.getName().endsWith(".txt")) {
: > >                 indexFile(writer, f);
: > >             }
: > >         }
: > >     }
: > >
: > >     // method to actually index a file using Lucene
: > >     private static void indexFile(IndexWriter writer, File f)
: > >         throws IOException {
: > >
: > >         if (f.isHidden() || !f.exists() || !f.canRead()) {
: > >             return;
: > >         }
: > >
: > >         System.out.println("Indexing " + f.getCanonicalPath());
: > >
: > >         Document doc = new Document();
: > >         doc.add(Field.Text("contents", new FileReader(f)));
: > >
: > >         doc.add(Field.Keyword("filename", f.getCanonicalPath()));
: > >         writer.addDocument(doc);
: > >     }
: > > }
: > >
: > >
: >
: >
:
:
:
:
:



-Hoss

