lucene-general mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: How to create Index ?
Date Tue, 20 Sep 2005 02:48:10 GMT
Arpit,

It looks like you've omitted the import statements from  
Indexer.java.  The book omits import statements to conserve space,  
but they are important.  The code is provided in its entirety at  
http://www.lucenebook.com
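
As a rough sketch of what is missing, the small program below lists the fully qualified class names behind the simple names used in the quoted Indexer.java and reports which of them can be loaded from the current classpath. Each line it prints is also the import statement that would go at the top of Indexer.java. The class name ImportCheck and its isAvailable helper are made up for this illustration, and the Lucene package names assume the Lucene 1.4-era API that the quoted code calls (Field.Text, Field.Keyword).

```java
import java.util.Arrays;
import java.util.List;

/** Reports which classes used by Indexer.java can be loaded from the classpath. */
final class ImportCheck {

    // Fully qualified names behind the simple names used in the quoted Indexer.java.
    // Assumption: the Lucene package names match the Lucene 1.4-era API.
    static final List<String> REQUIRED = Arrays.asList(
        "java.io.File",
        "java.io.FileReader",
        "java.io.IOException",
        "java.util.Date",
        "org.apache.lucene.analysis.standard.StandardAnalyzer",
        "org.apache.lucene.document.Document",
        "org.apache.lucene.document.Field",
        "org.apache.lucene.index.IndexWriter");

    /** Returns true if the named class can be found on the current classpath. */
    static boolean isAvailable(String className) {
        try {
            // Look the class up without running its static initializers.
            Class.forName(className, false, ImportCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Print one line per class, formatted as the import it corresponds to.
        for (String name : REQUIRED) {
            String status = isAvailable(name) ? "found    " : "MISSING  ";
            System.out.println(status + "import " + name + ";");
        }
    }
}
```

If the four org.apache.lucene lines are reported as MISSING, the Lucene JAR is probably not on the classpath; passing the JAR itself to -cp (rather than a directory of unzipped files) when running javac and java is the usual fix.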

In fact, you could build an index by running the code directly (read  
the README file and follow the instructions first) by typing "ant  
Indexer" and following the prompts.  One of the prompts asks you  
where to put the index itself, and the next prompt asks for the  
directory of text files to index.

     Erik



On Sep 19, 2005, at 10:34 PM, Arpit Sharma wrote:

> I have put the .jar file in C:\lucene. I have also
> unzipped it and put all the directories (like
> analysis, index, store) in C:\lucene.
>
> Now, how do I create an index?
> All the text files are in the C:\text directory. I have
> the "Lucene in Action" book, and with its help I wrote
> an Indexer.java program in C:\lucene. When I tried
> to compile it, it gave lots of errors.
> The code is fine (it is copied and pasted from the book).
>
> I am sure that there is some path problem. What should
> I do?
>
> Thanks
>
> Here is the code for Indexer.java:
> ----------------
>
> /**
>  * This code was originally written for
>  * Erik's Lucene intro java.net article
>  */
> public class Indexer {
>
>     public static void main(String[] args) throws Exception {
>         if (args.length != 2) {
>             throw new Exception("Usage: java " + Indexer.class.getName()
>                 + " <index dir> <data dir>");
>         }
>
>         File indexDir = new File(args[0]);
>         File dataDir = new File(args[1]);
>
>         long start = new Date().getTime();
>         int numIndexed = index(indexDir, dataDir);
>         long end = new Date().getTime();
>
>         System.out.println("Indexing " + numIndexed + " files took "
>             + (end - start) + " milliseconds");
>     }
>
>     // open an index and start file directory traversal
>     public static int index(File indexDir, File dataDir)
>         throws IOException {
>         if (!dataDir.exists() || !dataDir.isDirectory()) {
>             throw new IOException(dataDir
>                 + " does not exist or is not a directory");
>         }
>
>         IndexWriter writer = new IndexWriter(indexDir,
>             new StandardAnalyzer(), true);
>         writer.setUseCompoundFile(false);
>
>         indexDirectory(writer, dataDir);
>
>         int numIndexed = writer.docCount();
>         writer.optimize();
>         writer.close();
>         return numIndexed;
>     }
>
>     // recursive method that calls itself when it finds a directory
>     private static void indexDirectory(IndexWriter writer, File dir)
>         throws IOException {
>         File[] files = dir.listFiles();
>         for (int i = 0; i < files.length; i++) {
>             File f = files[i];
>             if (f.isDirectory()) {
>                 indexDirectory(writer, f);
>             } else if (f.getName().endsWith(".txt")) {
>                 indexFile(writer, f);
>             }
>         }
>     }
>
>     // method to actually index a file using Lucene
>     private static void indexFile(IndexWriter writer, File f)
>         throws IOException {
>         if (f.isHidden() || !f.exists() || !f.canRead()) {
>             return;
>         }
>
>         System.out.println("Indexing " + f.getCanonicalPath());
>
>         Document doc = new Document();
>         doc.add(Field.Text("contents", new FileReader(f)));
>         doc.add(Field.Keyword("filename", f.getCanonicalPath()));
>         writer.addDocument(doc);
>     }
> }
>

