Date: Wed, 20 Dec 2006 07:29:27 -0800 (PST)
From: JT Kimbell <jtkimbell@yahoo.com>
To: java-user@lucene.apache.org
Subject: Re: Help with jump from 1.4.3 to 2.0.0


I've sent the code your way.  I'm downloading Eclipse right now so I can step
through it with the debugger once I get everything set up.

However, I don't think I am using the same index for each of them, as this
is all actually on 3 different machines.  Machine A has 1.4.3 and I wrote
that code on that machine.  Machine B has 2.0.0 and I copied 1.4.3's code
over and then 'fixed' it.  Machine C has access to the necessary text files,
and I just FTP them to the other machines when necessary, so the indexes are
completely independent of each other.  I just seem to get a null pointer
exception when it reaches the August 2005 folder.  I can catch the exception
and continue on, but then none of those files get indexed, so we end up with
~20 fewer documents than we should have.  I can't send anyone the actual
files, but I could list their names.  Could the file names be throwing the
indexer off?  Are there any special characters that can do that?
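
One thing that may be worth ruling out (an assumption on my part, not a
confirmed diagnosis): File.listFiles() returns null when a path cannot be
listed, for example because of permissions or an I/O error.  In that case
files.length in the recursive call would throw, and the catch block in the
caller's loop would report its own f, which would be the August 2005 folder.
Below is a minimal sketch of a null-safe walk with the Lucene calls left out.
The class name SafeDirectoryWalk and the method collectTextFiles are
hypothetical and exist only for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class SafeDirectoryWalk {

    // Recursively collects readable, non-hidden .txt files under dir.
    // Directories that cannot be listed are recorded in 'skipped'
    // instead of triggering a NullPointerException.
    static void collectTextFiles(File dir, List<File> found, List<File> skipped) {
        File[] files = dir.listFiles();
        if (files == null) {
            // listFiles() returns null if dir is not a directory
            // or if an I/O error occurs while listing it.
            skipped.add(dir);
            return;
        }
        for (File f : files) {
            if (f.isDirectory()) {
                collectTextFiles(f, found, skipped);
            } else if (f.getName().endsWith(".txt") && f.canRead() && !f.isHidden()) {
                // In the real indexer this is where indexFile(writer, f) would go.
                found.add(f);
            }
        }
    }

    public static void main(String[] args) {
        // Root directory to scan; defaults to the current directory.
        File root = new File(args.length > 0 ? args[0] : ".");
        List<File> found = new ArrayList<>();
        List<File> skipped = new ArrayList<>();
        collectTextFiles(root, found, skipped);
        System.out.println("Text files found: " + found.size());
        for (File d : skipped) {
            System.out.println("Could not list: " + d);
        }
    }
}
```

If the August 2005 folder shows up under "Could not list", the problem would
be with how that folder landed on the machine rather than with Lucene itself.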

Also, I leave for a week-long vacation tomorrow, so I probably won't be able
to reply or test things for a few days.

Thanks so much,

JT


Gopikrishnan Subramani wrote:
> 
> All I can suspect is that perhaps you are trying to add documents to an
> index that was originally created using Lucene 1.4.3.
> 
> If trying to create a fresh index doesn't work, you could send me your
> indexer code so I can take a look.
> 
> -Gopi
> 
> 
> On 12/19/06, JT Kimbell <jtkimbell@yahoo.com> wrote:
>>
>>
>> Hi,
>>
>> I'm working on learning Lucene for my job, and the book one of my
>> professors purchased for the two of us is Lucene In Action, which is a
>> good book but is based on version 1.4.3 (I believe).  I am beginning to
>> grasp a lot of the basic concepts behind Lucene and have a basic searching
>> and indexing program written on that professor's server (which is running
>> 1.4.3).  However, on my server for work I am using 2.0.0, and it was
>> agreed that it would be best that I use the newer version.  My program ran
>> fine using 1.4.3, but once I made a few changes to make it compatible with
>> 2.0.0 it now throws a NullPointerException about 80% of the way through.
>>
>> For some background on the files, they are all .txt files stored in a
>> directory that has folders representing different years (e.g. 2005).
>> Within each of those there are month folders (e.g. August 2005), and
>> those folders contain all the documents.  When I catch the exception and
>> print which File f my program is currently on, it says it is that August
>> 2005 folder.  My program is exactly the same except for updating Field to
>> be compatible with 2.0.0, and the data is an exact copy of the other data.
>>
>> So I suppose I have two questions:
>>
>> 1)  The relevant methods from the two programs are below.  Does anyone
>> have any ideas why this isn't working?  Am I doing something wrong or
>> assuming something I shouldn't?  (If you need to see the full code with
>> all comments for either program, let me know.)
>>
>> 2)  Is there a good tutorial or something online for version 2.0.0 just
>> to help me understand it better?  Do you have any tips?
>>
>> Version 1.4.3 Code
>>        //This method recursively calls itself when it finds a directory
>>        public void indexDirectory(IndexWriter writer, File dir)
>>                        throws IOException{
>>                File[] files = dir.listFiles();
>>
>>                for(int i = 0; i < files.length; i++){
>>                        File f = files[i];
>>                        if (f.isDirectory()){
>>                                indexDirectory(writer, f);
>>                        }else if (f.getName().endsWith(".txt")){
>>                                indexFile(writer, f);
>>                        }
>>                }
>>        }
>>
>>        //This method indexes each individual file
>>        public void indexFile(IndexWriter writer, File f)
>>                        throws IOException{
>>
>>                if(f.isHidden() || !f.exists() || !f.canRead()){
>>                        return;
>>                }
>>
>>                Document doc = new Document();
>>                doc.add(Field.Text("contents", new FileReader(f)));
>>                doc.add(Field.Keyword("filename", f.getCanonicalPath()));
>>                writer.addDocument(doc);
>>        }
>>
>> Version 2.0.0 Code
>>        //This method recursively calls itself when it finds a directory
>>        public void indexDirectory(IndexWriter writer, File dir)
>>                        throws IOException{
>>                File[] files = dir.listFiles();
>>
>>                for(int i = 0; i < files.length; i++){
>>                        File f = files[i];
>>                        try{
>>                                if (f.isDirectory()){
>>                                        indexDirectory(writer, f);
>>                                }else if (f.getName().endsWith(".txt")){
>>                                        //Seems this is where it is first thrown...
>>                                        indexFile(writer, f);
>>                                        System.out.println(f);
>>                                }
>>                        }catch(NullPointerException npe){
>>                                npe.printStackTrace(System.out);
>>                                System.out.println("File is: " + f);
>>                        }
>>                }
>>        }
>>
>>        //This method indexes each individual file
>>        public void indexFile(IndexWriter writer, File f)
>>                        throws IOException{
>>
>>                if(f.isHidden() || !f.exists() || !f.canRead()){
>>                        return;
>>                }
>>
>>                Document doc = new Document();
>>                doc.add(new Field("contents", new FileReader(f)));
>>                doc.add(new Field("filename", f.getCanonicalPath(),
>> Field.Store.YES, Field.Index.UN_TOKENIZED));
>>                writer.addDocument(doc);
>>        }
>>
>> Thanks so much for any help you can give me.  It seems strange to me that
>> when I print File f, it prints out a directory name (August 2005), even
>> though it apparently got past the isDirectory check and is now checking
>> whether the name has a .txt extension.
>>
>> Thanks,
>>
>> JT
>> --
>> View this message in context:
>> http://www.nabble.com/Help-with-jump-from-1.4.3-to-2.0.0-tf2846591.html#a7949145
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/Help-with-jump-from-1.4.3-to-2.0.0-tf2846591.html#a7991979
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

