poi-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 35045] New: - Extracting text from word files fails
Date Tue, 24 May 2005 17:01:11 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=35045>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=35045

           Summary: Extracting text from word files fails
           Product: POI
           Version: 2.5
          Platform: PC
        OS/Version: Windows 2000
            Status: NEW
          Severity: critical
          Priority: P2
         Component: POI Overall
        AssignedTo: poi-dev@jakarta.apache.org
        ReportedBy: RobertEberhardt@gmx.de


Hello,

I am trying to use POI to extract the text of some Word documents with the
following code:
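import java.io.StringWriter;
import org.apache.poi.hdf.extractor.WordDocument;

StringWriter writer = new StringWriter();
// For the failing files, the stack trace below shows the exception being thrown from this constructor.
WordDocument doc = new WordDocument("C:\\arj\\pdf\\peer.doc");
doc.openDoc();
doc.writeAllText(writer);
System.out.println(writer.toString());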
Some Word files fail with the following exception:
java.lang.NullPointerException
	at org.apache.poi.hdf.extractor.Utils.convertBytesToShort(Utils.java:47)
	at org.apache.poi.hdf.extractor.StyleSheet.doCHPOperation(StyleSheet.java:176)
	at org.apache.poi.hdf.extractor.StyleSheet.uncompressProperty(StyleSheet.java:685)
	at org.apache.poi.hdf.extractor.StyleSheet.uncompressProperty(StyleSheet.java:565)
	at org.apache.poi.hdf.extractor.WordDocument.addParagraphContent(WordDocument.java:1050)
	at org.apache.poi.hdf.extractor.WordDocument.createParagraph(WordDocument.java:942)
	at org.apache.poi.hdf.extractor.WordDocument.addBlockContent(WordDocument.java:876)
	at org.apache.poi.hdf.extractor.WordDocument.writeSection(WordDocument.java:681)
	at org.apache.poi.hdf.extractor.WordDocument.<init>(WordDocument.java:211)
	at org.apache.poi.hdf.extractor.WordDocument.<init>(WordDocument.java:186)
	at zb.sts.text.WordTester.main(WordTester.java:27)
Exception in thread "main" 


For other Word files, the text is only partially extracted.
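
To narrow down which documents trigger the exception, one option is to run the extraction over a set of files and record every failure instead of stopping at the first one. The following is a minimal sketch, not part of POI: the class ExtractionTriage, the interface TextExtractor, and the file names in main are hypothetical, and the extractor passed in stands in for whatever extraction call is under test (for example the WordDocument calls quoted in this report).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ExtractionTriage {

    /** Hypothetical stand-in for the extraction call under test. */
    interface TextExtractor {
        String extract(String path) throws Exception;
    }

    /**
     * Runs the extractor on every path and returns a map from each
     * failing path to the exception it raised, in input order.
     */
    static Map<String, Throwable> findFailures(List<String> paths, TextExtractor extractor) {
        Map<String, Throwable> failures = new LinkedHashMap<>();
        for (String path : paths) {
            try {
                extractor.extract(path);
            } catch (Exception e) {
                // NullPointerException is a RuntimeException, so it is caught here too.
                failures.put(path, e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Fake extractor for demonstration only: fails on one made-up file name.
        TextExtractor fake = path -> {
            if (path.equals("broken.doc")) {
                throw new NullPointerException("simulated failure");
            }
            return "text of " + path;
        };

        Map<String, Throwable> failures =
                findFailures(Arrays.asList("ok.doc", "broken.doc"), fake);

        for (Map.Entry<String, Throwable> entry : failures.entrySet()) {
            System.out.println(entry.getKey() + " failed with " + entry.getValue());
        }
    }
}
```

Replacing the fake extractor with the real extraction call would produce a list of failing documents, which could then be attached to the bug as test cases.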

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/

