lucene-general mailing list archives

From mano dasanayaka <mmcd2...@yahoo.com>
Subject Re: Problem of indexing pdf files
Date Mon, 12 Sep 2005 04:25:06 GMT
Hi,
 
Lucene on its own won't index PDF files, because the text has to be extracted first. But there's an ongoing
project on SourceForge related to content search called "docSearcher". docSearcher
supports indexing PDF files and all the MS formats except PPT. So I think you had better
have a look at it. Most importantly, docSearcher is built on Lucene.
 
As for the warnings you mentioned, they are common. You have to add an appender for the logger
and initialize log4j with a properties file.
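 
As a rough sketch of that setup, assuming log4j 1.x and a file named log4j.properties placed on the classpath (the appender name "stdout", the WARN level and the pattern are only illustrative choices, not anything taken from your project), something like this should make the warning go away:

```properties
# Send everything at WARN level or above to the appender named "stdout"
log4j.rootLogger=WARN, stdout

# "stdout" is a console appender (the name itself is arbitrary)
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
```

If you would rather not keep a properties file, calling org.apache.log4j.BasicConfigurator.configure() once at startup sets up a default console appender and should have the same effect on the warning.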
 
 
 
Best Regards,
Mano
 
tirupathi reddy <tirupathireddy_m@yahoo.com> wrote:
Hello,

I am getting the following warning messages when I index PDF files with Lucene:

log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
log4j:WARN Please initialize the log4j system properly.

This is the code I am using:

if (pdf.exists())
{
    String text = "";
    try {
        PDDocument document = PDDocument.load(pdf); // load the file

        PDFTextStripper pts = new PDFTextStripper(); // extract the text
        text = pts.getText(document);
        document.close();
    }
    catch (IOException e) {
        // An IOException here can mean more than a missing file, so report the cause
        System.out.println("Could not read PDF: " + e.getMessage());
    }
    mDocument.add(Field.Text("fulltext", text));
}


thanx,
MTREDDY




Tirupati Reddy Manyam 
24-06-08, 
Sundugaullee-24, 
79110 Freiburg 
GERMANY. 

Phone: 00497618811257 
cell : 004917624649007

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

mmcd