lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From JT Kimbell <jtkimb...@yahoo.com>
Subject Re: Help with jump from 1.4.3 to 2.0.0
Date Wed, 20 Dec 2006 19:38:30 GMT

I figured it out.  Gopi asked me some questions that got me searching and it
turns out my JVM wasn't 1.5.06, it was 1.4.2.  I grabbed the newest version
and made it the default JVM and now I no longer have the problem.

Thanks a bunch for your help Gopi.

JT



JT Kimbell wrote:
> 
> I've sent the code your way.  I'm downloading eclipse right now so I can
> step through with its debugger once I get it all set up.  
> 
> However, I don't think I am using the same index for each of them, as this
> is all actually on 3 different machines.  Machine A has 1.4.3 and I wrote
> that code on that machine.  Machine B has 2.0.0 and I copied 1.4.3's code
> over and then 'fixed' it.  Machine C has access to the necessary text
> files, and I just FTP them to the other machines when necessary, so the
> indexes are completely independent of each other.  I just seem to get a
> null pointer exception when it reaches the August 2005 folder.  I can
> catch the exception and continue on, but then I get none of those files
> indexed, so that's ~20 less that we should have indexed.  I can't send
> anyone the actual files, but I could list the names of the files, perhaps
> that is throwing the indexer off?  Are there any special characters that
> can do that?
> 
> Also, I leave for a week-long vacation tomorrow, so I probably won't be
> able to reply or test things for a few days.
> 
> Thanks so much,
> 
> JT
> 
> 
> 
> Gopikrishnan Subramani wrote:
>> 
>> All I could suspect is perhaps you are trying to add documents to an
>> index
>> that was originally created using Lucene 1.4.3.
>> 
>> If trying to create a fresh index doesn't work, you could send me your
>> indexer code so I can take a look.
>> 
>> -Gopi
>> 
>> 
>> On 12/19/06, JT Kimbell <jtkimbell@yahoo.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> I'm working on learning Lucene for my job, and the book one of my
>>> professors
>>> purchased for myself and her is Lucene In Action, which is a good book
>>> but
>>> it is based on version 1.4.3 (I believe).  I am beginning to grasp a lot
>>> of
>>> the basic concepts behind Lucene and have a basic searching and indexing
>>> program written on the said professor's server (which is running 1.4.3).
>>> However, on my server for work I am using 2.0.0 and it was agreed that
>>> it
>>> would be best that I use the newer version.  My program ran fine using
>>> 1.4.3, but once I made a few changes to make it compatible with 2.0.0 it
>>> now
>>> returns a Null Pointer Exception about 80% of the way through.
>>>
>>> For some background on the files, they are all .txt files stored in a
>>> directory that has folders representing different years (e.g. 2005),
>>> within
>>> that there are month folders (August 2005) and those folders contain all
>>> the
>>> documents.  When I catch the exception and print while File f my program
>>> is
>>> currently on, it says it is that August 2005 folder.  My program is
>>> exactly
>>> the same except for updating Field to be compatible with 2.0.0 and the
>>> data
>>> is an exact copy of the other data.
>>>
>>> So I suppose I have two questions:
>>>
>>> 1)  The relevant methods from the two programs are below, does anyone
>>> have
>>> any ideas why this isn't working, am I doing something wrong or assuming
>>> something I shouldn't?  (If you need to see the full code with all
>>> comments
>>> for either program, let me know).
>>>
>>> 2)  Is there a good tutorial or something online for version 2.0.0 just
>>> to
>>> help me understand it better?  Do you have any tips?
>>>
>>> Version 1.4.3 Code
>>>        //This method recursively calls itself when it finds a directory
>>>        public void indexDirectory(IndexWriter writer, File dir) throws
>>> IOException{
>>>                File[] files = dir.listFiles();
>>>
>>>                for(int i = 0; i < files.length; i++){
>>>                        File f = files[i];
>>>                        if (f.isDirectory()){
>>>                                indexDirectory(writer, f);
>>>                        }else if (f.getName().endsWith(".txt")){
>>>                                indexFile(writer, f);
>>> }
>>>                }
>>>        }
>>>
>>>        //This method indexes each individual file
>>>        public void indexFile(IndexWriter writer, File f) throws
>>> IOException{
>>>
>>>                if(f.isHidden() || !f.exists() || !f.canRead()){
>>>                        return;
>>>                }
>>>
>>>                Document doc = new Document();
>>>                doc.add(Field.Text("contents", new FileReader(f)));
>>>                doc.add(Field.Keyword("filename", f.getCanonicalPath()));
>>>                writer.addDocument(doc);
>>>        }
>>>
>>> Version 2.0.0 Code
>>>        //This method recursively calls itself when it finds a directory
>>>        public void indexDirectory(IndexWriter writer, File dir) throws
>>> IOException{
>>>                File[] files = dir.listFiles();
>>>
>>>                for(int i = 0; i < files.length; i++){
>>>                        File f = files[i];
>>>                        try{
>>>                             if (f.isDirectory()){
>>>                                   indexDirectory(writer, f);
>>>                       }else if (f.getName().endsWith(".txt")){  //Seems
>>> this is where it is first thrown...
>>>                                   indexFile(writer, f);
>>>                                   System.out.println(f);
>>>                             }
>>>                        }catch(NullPointerException npe){
>>>                                npe.printStackTrace(System.out);
>>>                                System.out.println("File is: " + f);
>>>                        }
>>>                }
>>>        }
>>>
>>>        //This method indexes each individual file
>>>        public void indexFile(IndexWriter writer, File f) throws
>>> IOException{
>>>
>>>                if(f.isHidden() || !f.exists() || !f.canRead()){
>>>                        return;
>>>                }
>>>
>>>                Document doc = new Document();
>>>                doc.add(new Field("contents", new FileReader(f)));
>>>                doc.add(new Field("filename", f.getCanonicalPath(),
>>> Field.Store.YES, Field.Index.UN_TOKENIZED));
>>>                writer.addDocument(doc);
>>>        }
>>>
>>> Thanks so much for any help you can give me.  It seems strange to me
>>> that
>>> when I print File f, it prints out a directory name (August 2005), but
>>> got
>>> past the isDirectory statement and is now checking to see if it has a
>>> .txt
>>> extension.
>>>
>>> Thanks,
>>>
>>> JT
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Help-with-jump-from-1.4.3-to-2.0.0-tf2846591.html#a7949145
>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>> 
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Help-with-jump-from-1.4.3-to-2.0.0-tf2846591.html#a7996431
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message