lucene-java-user mailing list archives

From Ben Litchfield <...@csh.rit.edu>
Subject RE: Lucene refresh index function (incremental indexing).
Date Tue, 25 Nov 2003 16:58:07 GMT

Logging uses log4j and can be configured.  If you are having issues with
specific PDFs, you can post a bug on the SourceForge site or mail me
the PDFs directly and I will look at them.

Ben
http://www.pdfbox.org
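
A minimal log4j 1.x configuration along the following lines might quiet the
low-level parser output described in this thread. It is only a sketch: the
appender name, the output pattern, and the WARN threshold are assumptions, and
the org.pdfbox logger name is inferred from the logger named in the quoted
warning. The example file shipped with PDFBox may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">

  <!-- Assumed appender: write log events to the console. -->
  <appender name="console" class="org.apache.log4j.ConsoleAppender">
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d %-5p [%c] %m%n"/>
    </layout>
  </appender>

  <!-- Assumed threshold: hide DEBUG and INFO output from PDFBox classes. -->
  <logger name="org.pdfbox">
    <level value="warn"/>
  </logger>

  <root>
    <priority value="warn"/>
    <appender-ref ref="console"/>
  </root>

</log4j:configuration>
```

Saved as log4j.xml, a file like this would be passed with the
-Dlog4j.configuration option shown in the quoted message.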


On Tue, 25 Nov 2003, Zhou, Oliver wrote:

> I do have other problems with PDFBox-0.6.4.  For one, it prints annoying debug
> output from the very low-level parsing process.  For another, I hit an infinite
> loop while indexing PDF files, although the release notes say the infinite
> loop bug has been fixed.  Does anybody know what's going on?
>
> Thanks,
> Oliver
>
>
>
> -----Original Message-----
> From: Ben Litchfield [mailto:ben@csh.rit.edu]
> Sent: Tuesday, November 25, 2003 9:45 AM
> To: Lucene Users List
> Subject: RE: Lucene refresh index function (incremental indexing).
>
>
>
> Yes, just add the log4j configuration.  The easiest way to do that is as a
> system property, like this:
>
> java -Dlog4j.configuration=log4j.xml org.apache.lucene.demo.IndexHTML
> -create -index c:\\index ..
>
> where log4j.xml is the path to your log4j config.  PDFBox has an example
> one you can use.
>
> Ben
> http://www.pdfbox.org
>
> On Tue, 25 Nov 2003, Zhou, Oliver wrote:
>
> > Lucene doesn't have a PDF parser.  In order to index PDF files you have to
> > add one yourself.  PDFBox is a good choice.  You may just ignore the log4j
> > warning, or you can add a log4j configuration to your classpath.
> >
> > Oliver
> >
> >
> > -----Original Message-----
> > From: Tun Lin [mailto:chentun@singnet.com.sg]
> > Sent: Monday, November 24, 2003 10:07 PM
> > To: 'Lucene Users List'
> > Subject: RE: Lucene refresh index function (incremental indexing).
> >
> >
> > Does it support indexing the contents of PDF files? I have found a project
> > called PDFBox that can be integrated with Lucene to search inside PDF
> > files. Currently, Lucene can only search for the PDF filename. I tried
> > PDFBox and got the following message when I typed the command:
> >
> > java org.apache.lucene.demo.IndexHTML -create -index c:\\index ..
> >
> > log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
> > log4j:WARN Please initialize the log4j system properly.
> >
> > Can anyone advise?
> >
> > -----Original Message-----
> > From: Doug Cutting [mailto:cutting@lucene.com]
> > Sent: Tuesday, November 25, 2003 5:01 AM
> > To: Lucene Users List
> > Subject: Re: Lucene refresh index function (incremental indexing).
> >
> > Tun Lin wrote:
> > > These are the steps I took:
> > >
> > > 1) I index all the files in a particular directory using the command:
> > > java org.apache.lucene.demo.IndexHTML -create -index c:\\index ..
> > > which puts all the index files in c:\\index.
> > > 2) Every time I add a file to that directory, I need to reindex the
> > > whole directory to generate the index again. As the directory gets
> > > larger, the indexing takes longer.
> > >
> > > My question is: how do I update the index automatically every time a
> > > new document is added to that directory, without re-running the indexer
> > > manually?
> >
> > To update, try removing the '-create' from the command line.  The demo code
> > supports incremental updates.  It will re-scan the directory and figure out
> > which files have changed, what new files have appeared and which previously
> > existing files have been removed.
> >
> > Doug
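
The re-scan described above can be sketched in plain Java as a comparison of
two snapshots: what the index recorded last time and what is on disk now. This
is not the demo's actual code. The class name IncrementalScan, the Delta type,
and the use of a path-to-last-modified map are all assumptions made for
illustration, and the sketch has no Lucene dependency.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene or its demo.
final class IncrementalScan {

    // Result of comparing the indexed snapshot with the directory on disk.
    static final class Delta {
        final SortedSet<String> added = new TreeSet<>();
        final SortedSet<String> changed = new TreeSet<>();
        final SortedSet<String> removed = new TreeSet<>();
    }

    // Both maps go from file path to last-modified timestamp.
    static Delta diff(Map<String, Long> indexed, Map<String, Long> onDisk) {
        Delta delta = new Delta();
        for (Map.Entry<String, Long> entry : onDisk.entrySet()) {
            Long previous = indexed.get(entry.getKey());
            if (previous == null) {
                delta.added.add(entry.getKey());      // new file appeared
            } else if (!previous.equals(entry.getValue())) {
                delta.changed.add(entry.getKey());    // timestamp differs
            }
        }
        for (String path : indexed.keySet()) {
            if (!onDisk.containsKey(path)) {
                delta.removed.add(path);              // file no longer exists
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        Map<String, Long> indexed = new HashMap<>();
        indexed.put("a.html", 100L);
        indexed.put("b.html", 100L);
        indexed.put("old.html", 100L);

        Map<String, Long> onDisk = new HashMap<>();
        onDisk.put("a.html", 100L);   // unchanged
        onDisk.put("b.html", 250L);   // modified since last run
        onDisk.put("c.html", 300L);   // newly added

        Delta delta = diff(indexed, onDisk);
        System.out.println("added=" + delta.added);
        System.out.println("changed=" + delta.changed);
        System.out.println("removed=" + delta.removed);
    }
}
```

An indexer built on this idea would delete the documents for the changed and
removed paths and then add documents for the changed and added paths, leaving
everything else in the index untouched.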
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >

