lucene-general mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: How to create Index ?
Date Thu, 22 Sep 2005 18:56:08 GMT
Arpit - as was said below, the code is available from the Lucene in  
Action website (URL also below).

     Erik


On Sep 22, 2005, at 2:47 PM, Arpit Sharma wrote:

> Hi Erik and others,
>
> Can you provide me the full code for the Indexer program?
> I will really appreciate it.
>
> Thanks a lot.
>
> --- Erik Hatcher <erik@ehatchersolutions.com> wrote:
>
>
>> Arpit,
>>
>> It looks like you've omitted the import statements from
>> Indexer.java.  The book omits import statements to conserve space,
>> but they are important.  The code is provided in its entirety at
>> http://www.lucenebook.com
>>
>> In fact, you could build an index by running the code directly (read
>> the README file and follow the instructions first) by typing "ant
>> Indexer" and following the prompts.  One of the prompts asks you
>> where to put the index itself, and the next prompt asks for the
>> directory of text files to index.
>>
>>      Erik
>>
>>
>>
>> On Sep 19, 2005, at 10:34 PM, Arpit Sharma wrote:
>>
>>
>>> I have put the .jar file in C:\lucene, and I have also unzipped it
>>> and put all the directories (like analysis, index, store) in
>>> C:\lucene.
>>>
>>> Now how do I create an index? All the text files are in the C:\text
>>> directory. I have the "Lucene in Action" book, and with its help I
>>> made an Indexer.java program in C:\lucene, but when I tried to
>>> compile it, it gave lots of errors. The code is fine (it is
>>> copy-pasted from the book).
>>>
>>> I am sure that there is some path problem. What should I do?
>>>
>>> Thanks
>>>
>>> Here is the code of Indexer.java:
>>> ----------------
>>>
>>> /**
>>>  * This code was originally written for
>>>  * Erik's Lucene intro java.net article
>>>  */
>>> public class Indexer {
>>>
>>>     public static void main(String[] args) throws Exception {
>>>         if (args.length != 2) {
>>>             throw new Exception("Usage: java " + Indexer.class.getName()
>>>                 + " <index dir> <data dir>");
>>>         }
>>>
>>>         File indexDir = new File(args[0]);
>>>         File dataDir = new File(args[1]);
>>>
>>>         long start = new Date().getTime();
>>>         int numIndexed = index(indexDir, dataDir);
>>>         long end = new Date().getTime();
>>>
>>>         System.out.println("Indexing " + numIndexed + " files took "
>>>             + (end - start) + " milliseconds");
>>>     }
>>>
>>>     // open an index and start file directory traversal
>>>     public static int index(File indexDir, File dataDir)
>>>         throws IOException {
>>>         if (!dataDir.exists() || !dataDir.isDirectory()) {
>>>             throw new IOException(dataDir
>>>                 + " does not exist or is not a directory");
>>>         }
>>>
>>>         IndexWriter writer = new IndexWriter(indexDir,
>>>             new StandardAnalyzer(), true);
>>>         writer.setUseCompoundFile(false);
>>>
>>>         indexDirectory(writer, dataDir);
>>>
>>>         int numIndexed = writer.docCount();
>>>
>>>         writer.optimize();
>>>         writer.close();
>>>
>>>         return numIndexed;
>>>     }
>>>
>>>     // recursive method that calls itself when it finds a directory
>>>     private static void indexDirectory(IndexWriter writer, File dir)
>>>         throws IOException {
>>>
>>>         File[] files = dir.listFiles();
>>>         for (int i = 0; i < files.length; i++) {
>>>             File f = files[i];
>>>             if (f.isDirectory()) {
>>>                 indexDirectory(writer, f);
>>>             } else if (f.getName().endsWith(".txt")) {
>>>                 indexFile(writer, f);
>>>             }
>>>         }
>>>     }
>>>
>>>     // method to actually index a file using Lucene
>>>     private static void indexFile(IndexWriter writer, File f)
>>>         throws IOException {
>>>
>>>         if (f.isHidden() || !f.exists() || !f.canRead()) {
>>>             return;
>>>         }
>>>
>>>         System.out.println("Indexing " + f.getCanonicalPath());
>>>
>>>         Document doc = new Document();
>>>         doc.add(Field.Text("contents", new FileReader(f)));
>>>         doc.add(Field.Keyword("filename", f.getCanonicalPath()));
>>>         writer.addDocument(doc);
>>>     }
>>> }
>>>
>>
>>
>>
>
>
>
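
For readers following the thread, here is a minimal sketch of the point about
import statements. It is not the book's code. The class name TxtFileWalker and
the method collectTextFiles are hypothetical, and the sketch uses only the JDK
so it compiles without the Lucene JAR on the classpath. It repeats the
directory traversal from the quoted Indexer.java with its imports written out.
The commented import lines show what the quoted code would most likely need in
addition; the Lucene package names are an assumption based on releases of that
period and should be checked against the JAR actually in use.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// The quoted Indexer.java would additionally need imports along these lines
// (assumed package names; verify against your Lucene JAR):
//   import java.io.FileReader;
//   import java.util.Date;
//   import org.apache.lucene.analysis.standard.StandardAnalyzer;
//   import org.apache.lucene.document.Document;
//   import org.apache.lucene.document.Field;
//   import org.apache.lucene.index.IndexWriter;

public class TxtFileWalker {

    // Returns every readable, non-hidden .txt file under dataDir.
    public static List<File> collectTextFiles(File dataDir) throws IOException {
        if (!dataDir.exists() || !dataDir.isDirectory()) {
            throw new IOException(dataDir + " does not exist or is not a directory");
        }
        List<File> found = new ArrayList<File>();
        walk(dataDir, found);
        return found;
    }

    // Recursive helper: descends into subdirectories, keeps matching files.
    private static void walk(File dir, List<File> found) {
        File[] files = dir.listFiles();
        if (files == null) {
            return; // listFiles() returns null if the directory cannot be read
        }
        for (File f : files) {
            if (f.isDirectory()) {
                walk(f, found);
            } else if (f.getName().endsWith(".txt") && !f.isHidden() && f.canRead()) {
                found.add(f);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java TxtFileWalker <data dir>");
            return;
        }
        for (File f : collectTextFiles(new File(args[0]))) {
            System.out.println("Would index " + f.getCanonicalPath());
        }
    }
}
```

Running "java TxtFileWalker C:\text" would list the files that the traversal in
the quoted code would hand to Lucene. Without the four JDK import lines at the
top, the compiler reports "cannot find symbol" errors for File, IOException,
List and ArrayList, which is the same kind of error the missing imports cause
in Indexer.java.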

