poi-user mailing list archives

From Cédric Bosdonnat <cedric.bosdonnat....@free.fr>
Subject Re: Find start and finish point in HWPFDocument bytes.
Date Fri, 24 Sep 2010 08:17:10 GMT
Hi Zachary,

On Fri, 2010-09-24 at 15:39 +0930, Zachary Mitchell wrote:
> It looks like I was looking at the wrong HWPFDocument byte [] after all.
> 
> I have a demo HWPFDocument file,
> which is read from a Word file that has two gif
> images inserted and embedded.
> 
> I have been told that the bytes for the images
> inside the doc file, the HWPFDocument file
> I am programming with, starts at 
> Character point 0x01
> byte 01.

It seems you misunderstood what I meant by the 0x01 character. In the
Document stream of a Word file you can find all the text (starting at
fib.fcMin). In this text data, pictures are marked by a 0x01
character.

The catch is that not every 0x01 is necessarily a picture: the character
also needs the fcPicLoc SPRM (see the Table stream and the CHP plex). The
fcPicLoc then provides the offset of the beginning of the PICF structure
in the Data stream.
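
The rule above can be sketched in a few lines of plain Java. This is only
an illustration of the logic, not POI code: the Run class and the
pictureOffsets method are hypothetical names, and each Run stands in for a
stretch of text whose character properties have already been decoded from
the CHP plex. A null fcPicLoc means the run carries no fcPicLoc SPRM.

```java
import java.util.ArrayList;
import java.util.List;

class PictureMarkerScan {
    // Character that marks a picture placeholder in the document text.
    static final char PICTURE_MARKER = 0x01;

    /** Hypothetical stand-in for a run of text plus its decoded properties. */
    static final class Run {
        final String text;
        final Integer fcPicLoc; // offset into the Data stream, or null if absent

        Run(String text, Integer fcPicLoc) {
            this.text = text;
            this.fcPicLoc = fcPicLoc;
        }
    }

    /**
     * Returns the Data stream offsets of the PICF structures: one per run
     * that both contains a 0x01 character and carries an fcPicLoc.
     */
    static List<Integer> pictureOffsets(List<Run> runs) {
        List<Integer> offsets = new ArrayList<>();
        for (Run run : runs) {
            boolean hasMarker = run.text.indexOf(PICTURE_MARKER) >= 0;
            // A 0x01 without fcPicLoc is not a picture, and neither is
            // an fcPicLoc on a run that has no 0x01 character.
            if (hasMarker && run.fcPicLoc != null) {
                offsets.add(run.fcPicLoc);
            }
        }
        return offsets;
    }
}
```

Each returned offset is where you would start reading a PICF structure in
the Data stream. In POI, the picture table reachable through
HWPFDocument.getPicturesTable() should be the place to look for the real
equivalent of this lookup.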

I am too lazy to write a code snippet, but it shouldn't be too hard to
get all the pictures from a doc file using HWPFDocument...

Regards,

-- 
Cédric Bosdonnat
Go-oo hacker
http://go-oo.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr

