lucene-dev mailing list archives

From Durga.Tirunag...@Sun.COM
Subject Questions Lucene
Date Mon, 10 Sep 2007 23:56:42 GMT

Dear Gurus,

We are evaluating the possibility of using Lucene for our search
needs. I have a few questions before we embark on this evaluation:

       1) Which languages does Lucene support? It looks like it is
          able to handle only English. We are trying to see whether it
          works with Japanese, Chinese, and other character sets. Can
          someone answer this?

       2) After Lucene indexes a given data set, how does it handle
          incremental or dynamic changes in the data? In other words,
          our data keeps changing; how does Lucene handle this? Does it
          re-index whenever a new file enters the data set, or does it
          index the data incrementally?

       3) How does Lucene handle files deleted from a particular data
          set? Does Lucene automatically detect that a file has been
          deleted from the data set and immediately remove the index
          entries for that file?
      
       4) Please consider the following scenario. Lucene is given the
          following files to index:

          a) Files under /xyz/abc (say x.txt, y.txt, a.txt, b.txt,
             c.txt, etc.)

          b) Files under /def/ghi (say none.txt, dude.txt, hello.txt,
             etc.)

          After Lucene has finished indexing the files under these two
          directories, a search is made for a keyword that occurs in
          hello.txt. What does Lucene return? Does it return the fully
          qualified location of this file, i.e. /def/ghi/hello.txt?
      
       5) How does Lucene index a particular set of files? Is it based
          on keywords, on sentences, or on some other criterion?

       6) Is Lucene multi-threaded? For example, if Lucene is indexing
          a set of files in a given data set and one of them is huge
          (say a 2 GB file), does Lucene index that file in parts, in
          parallel (i.e. in a multi-threaded fashion), or does it index
          the file sequentially?

       7) Also, if a data set has multiple files, does Lucene process
          each file separately in a different thread, or does it
          process them sequentially?

       8) Does Lucene index only text files? We have a few databases;
          is it possible for us to index the data in them?

       9) Are there any performance benchmarks for Lucene?

Thanks a lot in advance!

-- Thanks
Durga Deep Tirunagari
Santa Clara, CA


