lucene-general mailing list archives

From Fernando Luiz Engelmann Junior <ferna...@softexpert.com>
Subject Re: How to create Index ?
Date Thu, 22 Sep 2005 19:33:34 GMT
Has anyone created an index and stored it in a database? I have an
application that uses JDBC, and I'm wondering whether it's possible to
store the Lucene indexes in that database. If any of you could help,
I'd appreciate it.


Erik Hatcher wrote:

> Arpit - as was said below, the code is available from the Lucene in  
> Action website (URL also below).
>
>     Erik
>
>
> On Sep 22, 2005, at 2:47 PM, Arpit Sharma wrote:
>
>> Hi Erik and others,
>>
>> Can you provide me with the full code for the Indexer program?
>> I would really appreciate it.
>>
>> Thanks a lot.
>>
>> --- Erik Hatcher <erik@ehatchersolutions.com> wrote:
>>
>>
>>> Arpit,
>>>
>>> It looks like you've omitted the import statements from
>>> Indexer.java.  The book omits import statements to conserve space,
>>> but they are important.  The code is provided in its entirety at
>>> http://www.lucenebook.com
>>>
>>> In fact, you could build an index by running the code directly (read
>>> the README file and follow the instructions first) by typing "ant
>>> Indexer" and following the prompts.  One of the prompts asks you
>>> where to put the index itself, and the next prompt asks for the
>>> directory of text files to index.
>>>
>>>      Erik
>>>
>>>
>>>
>>> On Sep 19, 2005, at 10:34 PM, Arpit Sharma wrote:
>>>
>>>
>>>> I have put the .jar file in C:\lucene, and I have also unzipped it
>>>> and put all the directories (like analysis, index, store) in
>>>> C:\lucene.
>>>>
>>>> Now, how do I create an index? All the text files are in the
>>>> C:\text directory. I have the "Lucene in Action" book, and with its
>>>> help I wrote an Indexer.java program in C:\lucene. When I tried to
>>>> compile it, it gave lots of errors. The code is fine (it is copied
>>>> and pasted from the book).
>>>>
>>>> I am sure that there is some path problem. What should I do?
>>>>
>>>> Thanks
>>>>
>>>> Here is the code of the Indexer.java:-
>>>> ----------------
>>>>
>>>> /**
>>>>  * This code was originally written for
>>>>  * Erik's Lucene intro java.net article
>>>>  */
>>>> public class Indexer {
>>>>
>>>>     public static void main(String[] args) throws Exception {
>>>>
>>>>         if (args.length != 2) {
>>>>             throw new Exception("Usage: java " + Indexer.class.getName()
>>>>                 + " <index dir> <data dir>");
>>>>         }
>>>>
>>>>         File indexDir = new File(args[0]);
>>>>         File dataDir = new File(args[1]);
>>>>
>>>>         long start = new Date().getTime();
>>>>         int numIndexed = index(indexDir, dataDir);
>>>>         long end = new Date().getTime();
>>>>
>>>>         System.out.println("Indexing " + numIndexed + " files took "
>>>>             + (end - start) + " milliseconds");
>>>>     }
>>>>
>>>>     // open an index and start file directory traversal
>>>>     public static int index(File indexDir, File dataDir)
>>>>         throws IOException {
>>>>
>>>>         if (!dataDir.exists() || !dataDir.isDirectory()) {
>>>>             throw new IOException(dataDir
>>>>                 + " does not exist or is not a directory");
>>>>         }
>>>>
>>>>         IndexWriter writer = new IndexWriter(indexDir,
>>>>             new StandardAnalyzer(), true);
>>>>         writer.setUseCompoundFile(false);
>>>>
>>>>         indexDirectory(writer, dataDir);
>>>>
>>>>         int numIndexed = writer.docCount();
>>>>
>>>>         writer.optimize();
>>>>         writer.close();
>>>>
>>>>         return numIndexed;
>>>>     }
>>>>
>>>>     // recursive method that calls itself when it finds a directory
>>>>     private static void indexDirectory(IndexWriter writer, File dir)
>>>>         throws IOException {
>>>>
>>>>         File[] files = dir.listFiles();
>>>>         for (int i = 0; i < files.length; i++) {
>>>>             File f = files[i];
>>>>             if (f.isDirectory()) {
>>>>                 indexDirectory(writer, f);
>>>>             } else if (f.getName().endsWith(".txt")) {
>>>>                 indexFile(writer, f);
>>>>             }
>>>>         }
>>>>     }
>>>>
>>>>     // method to actually index a file using Lucene
>>>>     private static void indexFile(IndexWriter writer, File f)
>>>>         throws IOException {
>>>>
>>>>         if (f.isHidden() || !f.exists() || !f.canRead()) {
>>>>             return;
>>>>         }
>>>>
>>>>         System.out.println("Indexing " + f.getCanonicalPath());
>>>>
>>>>         Document doc = new Document();
>>>>         doc.add(Field.Text("contents", new FileReader(f)));
>>>>
>>>>         doc.add(Field.Keyword("filename", f.getCanonicalPath()));
>>>>         writer.addDocument(doc);
>>>>     }
>>>> }
>>>>
>>>> __________________________________________________
>>>> Do You Yahoo!?
>>>> Tired of spam?  Yahoo! Mail has the best spam protection around
>>>> http://mail.yahoo.com
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
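
One way to separate a path problem from a missing-import problem is to run the directory traversal from the quoted Indexer on its own, with no Lucene classes involved. The following is only a sketch under that assumption: the class name TxtFileWalker and the method collectTxtFiles are made up for illustration and are not part of the book's code or of Lucene. It applies the same filter as the quoted indexDirectory and indexFile methods (recurse into directories, keep readable, non-hidden files ending in ".txt"), and adds a null check on listFiles() that the quoted code does not have.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene or the book's code.
class TxtFileWalker {

    // Recursively collects readable, non-hidden .txt files under dir,
    // using the same filter as the quoted indexDirectory/indexFile methods.
    public static List<File> collectTxtFiles(File dir) {
        List<File> found = new ArrayList<>();
        walk(dir, found);
        Collections.sort(found); // sort by pathname for a stable listing
        return found;
    }

    private static void walk(File dir, List<File> found) {
        File[] files = dir.listFiles();
        if (files == null) {
            return; // dir is not a directory, or it could not be read
        }
        for (File f : files) {
            if (f.isDirectory()) {
                walk(f, found);
            } else if (f.getName().endsWith(".txt")
                    && !f.isHidden() && f.canRead()) {
                found.add(f);
            }
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java TxtFileWalker <data dir>");
            return;
        }
        for (File f : collectTxtFiles(new File(args[0]))) {
            System.out.println(f.getPath());
        }
    }
}
```

If this lists the expected files when pointed at the data directory (C:\text in the original question), the directory layout is probably fine, and the compile errors are more likely to come from the missing import statements or from the Lucene jar not being on the compiler's classpath.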

