lucene-general mailing list archives

From Christophe <...@blowfish.com>
Subject Re: How to create Index ?
Date Thu, 22 Sep 2005 21:04:42 GMT
You could certainly subclass org.apache.lucene.store.Directory to
create a new concrete class that stores the data in a dbms; in fact,
the Javadoc for Directory specifically mentions that as a
possibility.  You could, for example, map the "files" in a JDBCDirectory
(a hypothetical class) to rows of a table in the dbms, with a text field
for the name of the "file" and a BLOB for the content (you would
probably need a modification timestamp field, too).  I didn't end up
doing it, but it does not look particularly difficult.
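
To make the table mapping concrete, here is a rough sketch of the row
layout described above. It is only an in-memory stand-in, not a working
JDBCDirectory: the class name FileTableSketch, the table name
lucene_files, and the column names are all assumptions made for
illustration, and the SQL in the comments only indicates what each
operation might look like against a real dbms. A real implementation
would still have to subclass Directory and handle streaming and locking.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: an in-memory model of the table a JDBCDirectory
// might map its "files" to. Assumed schema (not part of Lucene):
//   CREATE TABLE lucene_files (
//       name     VARCHAR(255) PRIMARY KEY,  -- name of the "file"
//       content  BLOB,                      -- raw bytes of the "file"
//       modified BIGINT                     -- modification timestamp
//   )
class FileTableSketch {

    // One row of the assumed table.
    private static final class Row {
        final byte[] content;
        final long modified;

        Row(byte[] content, long modified) {
            this.content = content;
            this.modified = modified;
        }
    }

    // Keyed by file name, like the primary key of the assumed table.
    private final Map<String, Row> rows = new TreeMap<String, Row>();

    // INSERT or UPDATE: store the whole body and stamp the row.
    void write(String name, byte[] content, long now) {
        rows.put(name, new Row(content.clone(), now));
    }

    // SELECT content FROM lucene_files WHERE name = ?
    byte[] read(String name) {
        Row row = rows.get(name);
        return row == null ? null : row.content.clone();
    }

    // SELECT name FROM lucene_files ORDER BY name
    String[] list() {
        return rows.keySet().toArray(new String[0]);
    }

    // SELECT COUNT(*) FROM lucene_files WHERE name = ?
    boolean exists(String name) {
        return rows.containsKey(name);
    }

    // SELECT modified FROM lucene_files WHERE name = ? (0 if no such row)
    long modified(String name) {
        Row row = rows.get(name);
        return row == null ? 0L : row.modified;
    }

    // Length of the stored content in bytes (0 if no such row).
    long length(String name) {
        Row row = rows.get(name);
        return row == null ? 0L : row.content.length;
    }

    // DELETE FROM lucene_files WHERE name = ?
    boolean delete(String name) {
        return rows.remove(name) != null;
    }

    // UPDATE lucene_files SET name = ? WHERE name = ?
    // (this sketch silently replaces an existing target row)
    boolean rename(String from, String to) {
        Row row = rows.remove(from);
        if (row == null) {
            return false;
        }
        rows.put(to, row);
        return true;
    }

    public static void main(String[] args) {
        FileTableSketch table = new FileTableSketch();
        table.write("segments", new byte[] {1, 2, 3}, System.currentTimeMillis());
        System.out.println(Arrays.toString(table.list()));
        System.out.println(table.length("segments") + " bytes stored");
    }
}
```

Each method corresponds to one kind of operation a Directory
implementation has to offer (listing, existence checks, reads, writes,
deletes, renames), so the sketch mainly shows that a three-column table
is enough to hold the state.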

On 22 Sep 2005, at 13:54, Fernando Luiz Engelmann Junior wrote:

> I have a portal application installed on the server. I want to
> store the index in the dbms because all the data would then be
> centralized in just one place (Oracle or MySQL, for example). So when
> I need to do a backup, or move my site to another server, the
> impact would be smaller than if I had the index in one place
> (filesystem) and the data in another (dbms). Besides, my view
> is that if I could store all the information in the dbms, I wouldn't
> have any headaches with security roles or anything like that.
>
>
>
> Christophe wrote:
>
>
>> Hi,
>>
>> (First time poster!)
>>
>> I considered that when working on my application, but I couldn't
>> figure out a reason that it would be an advantage over plain flat
>> files.  The only possible advantage I could see was distribution
>> (you could update the index in one place and have all the dbms
>> clients get copies), but I decided to solve that with an RMI
>> solution (a la Lucene in Action's examples).  What kind of
>> functionality were you looking to gain from storing the indexes
>> in the dbms?
>>
>> On 22 Sep 2005, at 12:33, Fernando Luiz Engelmann Junior wrote:
>>
>>
>>> Has anyone created an index and stored it in a database?
>>> I have an application that uses JDBC, and I'm wondering whether it's
>>> possible to store Lucene's indexes in this database. If
>>> any of you could help me, I would appreciate it.
>>>
>>>
>>> Erik Hatcher wrote:
>>>
>>>
>>>
>>>> Arpit - as was said below, the code is available from the
>>>> Lucene in Action website (URL also below).
>>>>
>>>>     Erik
>>>>
>>>>
>>>> On Sep 22, 2005, at 2:47 PM, Arpit Sharma wrote:
>>>>
>>>>
>>>>
>>>>> Hi Erik and others,
>>>>>
>>>>> Can you provide me with the full code for the Indexer program?
>>>>> I would really appreciate it.
>>>>>
>>>>> Thanks a lot.
>>>>>
>>>>> --- Erik Hatcher <erik@ehatchersolutions.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Arpit,
>>>>>>
>>>>>> It looks like you've omitted the import statements from
>>>>>> Indexer.java.  The book omits import statements to conserve space,
>>>>>> but they are important.  The code is provided in its entirety at
>>>>>> http://www.lucenebook.com
>>>>>>
>>>>>> In fact, you could build an index by running the code directly (read
>>>>>> the README file and follow the instructions first) by typing "ant
>>>>>> Indexer" and following the prompts.  One of the prompts asks you
>>>>>> where to put the index itself, and the next prompt asks for the
>>>>>> directory of text files to index.
>>>>>>
>>>>>>      Erik
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sep 19, 2005, at 10:34 PM, Arpit Sharma wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I have put the .jar file in C:\lucene, and I have also
>>>>>>> unzipped it and put all the directories (like
>>>>>>> analysis, index, store) in C:\lucene.
>>>>>>>
>>>>>>> Now, how do I create an index?
>>>>>>> All the text files are in the C:\text directory. I have the
>>>>>>> "Lucene in Action" book, and with its help I made
>>>>>>> an Indexer.java program in C:\lucene, but when I tried
>>>>>>> to compile it, it gave lots of errors.
>>>>>>> The code is fine (it is copied and pasted from the book).
>>>>>>>
>>>>>>> I am sure that there is some path problem. What should
>>>>>>> I do?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Here is the code of Indexer.java:
>>>>>>> ----------------
>>>>>>>
>>>>>>> /**
>>>>>>>  * This code was originally written for
>>>>>>>  * Erik's Lucene intro java.net article
>>>>>>>  */
>>>>>>> public class Indexer {
>>>>>>>
>>>>>>>     public static void main(String[] args) throws Exception {
>>>>>>>         if (args.length != 2) {
>>>>>>>             throw new Exception("Usage: java " + Indexer.class.getName()
>>>>>>>                 + " <index dir> <data dir>");
>>>>>>>         }
>>>>>>>
>>>>>>>         File indexDir = new File(args[0]);
>>>>>>>         File dataDir = new File(args[1]);
>>>>>>>
>>>>>>>         long start = new Date().getTime();
>>>>>>>         int numIndexed = index(indexDir, dataDir);
>>>>>>>         long end = new Date().getTime();
>>>>>>>
>>>>>>>         System.out.println("Indexing " + numIndexed + " files took "
>>>>>>>             + (end - start) + " milliseconds");
>>>>>>>     }
>>>>>>>
>>>>>>>     // open an index and start file directory traversal
>>>>>>>     public static int index(File indexDir, File dataDir)
>>>>>>>             throws IOException {
>>>>>>>         if (!dataDir.exists() || !dataDir.isDirectory()) {
>>>>>>>             throw new IOException(dataDir
>>>>>>>                 + " does not exist or is not a directory");
>>>>>>>         }
>>>>>>>
>>>>>>>         IndexWriter writer = new IndexWriter(indexDir,
>>>>>>>             new StandardAnalyzer(), true);
>>>>>>>         writer.setUseCompoundFile(false);
>>>>>>>
>>>>>>>         indexDirectory(writer, dataDir);
>>>>>>>
>>>>>>>         int numIndexed = writer.docCount();
>>>>>>>         writer.optimize();
>>>>>>>         writer.close();
>>>>>>>         return numIndexed;
>>>>>>>     }
>>>>>>>
>>>>>>>     // recursive method that calls itself when it finds a directory
>>>>>>>     private static void indexDirectory(IndexWriter writer, File dir)
>>>>>>>             throws IOException {
>>>>>>>
>>>>>>>         File[] files = dir.listFiles();
>>>>>>>         for (int i = 0; i < files.length; i++) {
>>>>>>>             File f = files[i];
>>>>>>>             if (f.isDirectory()) {
>>>>>>>                 indexDirectory(writer, f);
>>>>>>>             } else if (f.getName().endsWith(".txt")) {
>>>>>>>                 indexFile(writer, f);
>>>>>>>             }
>>>>>>>         }
>>>>>>>     }
>>>>>>>
>>>>>>>     // method to actually index a file using Lucene
>>>>>>>     private static void indexFile(IndexWriter writer, File f)
>>>>>>>             throws IOException {
>>>>>>>
>>>>>>>         if (f.isHidden() || !f.exists() || !f.canRead()) {
>>>>>>>             return;
>>>>>>>         }
>>>>>>>
>>>>>>>         System.out.println("Indexing " + f.getCanonicalPath());
>>>>>>>
>>>>>>>         Document doc = new Document();
>>>>>>>         doc.add(Field.Text("contents", new FileReader(f)));
>>>>>>>         doc.add(Field.Keyword("filename", f.getCanonicalPath()));
>>>>>>>         writer.addDocument(doc);
>>>>>>>     }
>>>>>>> }
>>>>>>>

