poi-dev mailing list archives

From MSB <markbrd...@tiscali.co.uk>
Subject Re: how to read comments ( Annotation ) with poi in word document ( 97/2000/xp/2003 )
Date Thu, 16 Jul 2009 07:18:43 GMT

In that case, I am not at all sure you can extract just the comments. Looking
through the javadoc for the HWPF stream, there is no obvious method to call
that returns just the comments. If it were me, I would simply get all of the
text from a Word document using HWPF and see if the comments are returned
along with the paragraph text. If they are, then you could dig around more
to see if it is possible to differentiate the comments from the paragraph
text in some way.

The easiest way to get at the text for the document would be to make use of
an instance of the org.apache.poi.hwpf.extractor.WordExtractor class. It has
two methods - getText() and getParagraphText() - that you could use to
examine the document's contents, a little like this:

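WordExtractor extractor = new WordExtractor(new FileInputStream(new
File("Your file name")));
String[] paraText = extractor.getParagraphText();
for(int i = 0; i < paraText.length; i++) {
    // Print each paragraph so you can check whether comment text appears
    System.out.println(paraText[i]);
}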

then, if you see the comments, start to dig around more to discover whether
it is possible to identify them specifically.
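
One way to make that check less of an eyeball job is to type a distinctive marker string into a comment in a test document and then search the extracted paragraphs for it. The following is only a rough sketch of that idea: the class name CommentProbe, the helper findParagraphsContaining and the marker text are all made up for illustration, and the hard-coded array merely stands in for whatever extractor.getParagraphText() actually returns for your file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of POI.
class CommentProbe {

    // Returns the index of every paragraph whose text contains the marker.
    static List<Integer> findParagraphsContaining(String[] paraText, String marker) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < paraText.length; i++) {
            // Skip null entries defensively; real extractor output may not have any.
            if (paraText[i] != null && paraText[i].contains(marker)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: sample data standing in for extractor.getParagraphText().
        // Replace it with the real array once you have a test document.
        String[] paraText = {
            "Introduction text.\r\n",
            "Body text REVIEW-MARKER-001 with the comment mixed in.\r\n",
            "Closing text.\r\n"
        };

        // Assumption: this is the text you typed into the comment in Word.
        String marker = "REVIEW-MARKER-001";

        List<Integer> hits = findParagraphsContaining(paraText, marker);
        if (hits.isEmpty()) {
            System.out.println("Marker not found; comments are probably not in the paragraph text.");
        } else {
            System.out.println("Marker found in paragraph(s): " + hits);
        }
    }
}
```

If the marker does turn up, the paragraphs at those indices are the ones to examine more closely for anything that sets the comment text apart from the body text.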


Mark B

bihag wrote:
> I am mainly targeting OLE2CDF (.doc) documents ...
> MSB wrote:
>> Can I ask which sort of documents you are targeting please - OLE2CDF or
>> OpenXML? I think that it ought to be possible to recover the comments for
>> an OpenXML based file (.docx) but am not sure that this is the case for a
>> binary (.doc) OLE2CDF file.
>> Yours
>> Mark B
>> bihag wrote:
>>> Hi,
>>> I want to read all the comments (Annotations) that are in a Word
>>> document.
>>> Please provide some example code to read the comments ...
>>> Appreciate your help ...
