poi-user mailing list archives

From "Alessandro Ilardo" <alessandro.ila...@my-sandro.net>
Subject wordExtractor cannot be found to compile with javac
Date Thu, 05 Jan 2006 23:42:26 GMT
Hello there,
I'm new to POI, and I want to use it to integrate Word files into Lucene.

I know that something suitable for my purpose should be at
http://www.textmining.org/, but every time I try to open that address I
just get this message:
 Hacked Fotolog ? ITALY owned .org ? HACKED BY ITALY

So I was not able to check out that package.

In any case, I downloaded the three latest .jar files from
http://encore.torchbox.com/poi-cvs-build/
and tried to compile a test class in order to test POI with Lucene.

It does not compile because javac cannot find the WordExtractor class.

C:\Documents and Settings\Alessandro\Desktop\Copia di PDFBox-0.7.2\PDFBox-0.7.2>javac -classpath lucene-1.4.3.jar;lucene-demos-1.4.3.jar;PDFBox-0.7.2.jar;poi-3.0.jar;poi-scratchpad-3.0.jar src\org\pdfbox\searchengine\lucene\luceneWORDDocument.java
src\org\pdfbox\searchengine\lucene\luceneWORDDocument.java:294: cannot resolve symbol
symbol  : class WordExtractor
location: class org.pdfbox.searchengine.lucene.LuceneWORDDocument
            WordExtractor extractor = new WordExtractor();
            ^
src\org\pdfbox\searchengine\lucene\luceneWORDDocument.java:294: cannot resolve symbol
symbol  : class WordExtractor
location: class org.pdfbox.searchengine.lucene.LuceneWORDDocument
            WordExtractor extractor = new WordExtractor();
                                          ^
2 errors



This is the code I used:
     HWPFDocument wdoc = new HWPFDocument(is);
     WordExtractor extractor = new WordExtractor();
     String contents = extractor.extractText(wdoc);
     StringReader reader = new StringReader( contents );
     document.add( Field.Text( "contents", reader ) );
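
A "cannot resolve symbol" error on a class name usually means either that the source file has no import for the class, or that no jar on the classpath contains it. One way to tell the two apart is a small probe like the sketch below, which only uses the standard library. The class name ClasspathProbe is hypothetical, and the default package org.apache.poi.hwpf.extractor is an assumption about where WordExtractor lives in the scratchpad jar, not something confirmed for this CVS build.

```java
public class ClasspathProbe {

    /** Returns true if the named class can be located on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // not found, or found but not loadable (e.g. a missing dependency)
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed fully qualified name; pass another name as the first argument to override it.
        String name = args.length > 0
                ? args[0]
                : "org.apache.poi.hwpf.extractor.WordExtractor";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found"));
    }
}
```

Run it with the same -classpath value given to javac, plus the directory holding ClasspathProbe.class. If the class is reported as found, the likely fix is adding the matching import to luceneWORDDocument.java; if not, the jars probably do not contain a class of that name.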

I'd appreciate any comments or suggestions to solve my problem.
Thanks in advance