cocoon-users mailing list archives

From "Hank Heidt" <hhe...@dsbox.com>
Subject RE: ms word xml and embedded images
Date Thu, 21 Apr 2005 21:27:39 GMT


It looks like you are right: equations are stored as a base64 encoded
Windows metafile.

I created an equation in Word and then looked at the XML. The name of the
binData element that holds the base64 string ends in .wmz:

	<w:binData w:name="wordml://08000001.wmz">

While for an embedded picture the name ends in .png:

	<w:binData w:name="wordml://03000002.png">

Oh well, this is something that I'll have to keep in mind if our users
ever want to use the formula editor.
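
For anyone who wants to check their own documents, here is a minimal Python sketch (not part of the Cocoon pipeline) that lists every w:binData element and the extension of its w:name. It assumes the standard Word 2003 "w" namespace URI; the function name list_bin_data is only illustrative.

```python
import base64
import os
import xml.etree.ElementTree as ET

# Assumption: the usual WordprocessingML 2003 namespace bound to the "w" prefix.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def list_bin_data(xml_text):
    """Return (name, extension, decoded_bytes) for each w:binData element."""
    root = ET.fromstring(xml_text)
    results = []
    for node in root.iter("{%s}binData" % W_NS):
        name = node.get("{%s}name" % W_NS, "")
        # "wordml://08000001.wmz" -> ".wmz"
        ext = os.path.splitext(name)[1].lower()
        # Word wraps the base64 text over many lines, so strip all whitespace.
        payload = base64.b64decode("".join((node.text or "").split()))
        results.append((name, ext, payload))
    return results
```

An extension of .wmz would point at a compressed metafile (an equation, in my test) and .png at an ordinary picture, but that is only what I observed in one document.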

-Hank

-----Original Message-----
From: Stavros Kounis [mailto:skounis@gmail.com] 
Sent: Thursday, April 21, 2005 4:41 PM
To: Hank Heidt
Subject: Re: ms word xml and embedded images

On 4/21/05, Hank Heidt <hheidt@dsbox.com> wrote:
> 
> Are you sure that the embedded equation image is being stored as a
> base64 encoded metafile?
> 
> I believe that Word 2003 XML instead serializes images (or at least
> most images) as base64 encoded pngs.

i'm already doing this in a .NET application
the steps i follow are:
1. get the string
2. decode it as base64
3. handle the result as a windows metafile

but i have not tried to handle it as png
i'll give it a try
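
The steps above, plus the gunzip step from the original question, can be sketched in Python roughly as follows. This is only an illustration, not the .NET code: the function names are made up, and the format check looks at leading magic bytes only (gzip, PNG, and the placeable-WMF header), so a metafile without a placeable header would come back as "unknown".

```python
import base64
import gzip

GZIP_MAGIC = b"\x1f\x8b"                  # start of any gzip stream
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"          # PNG file signature
PLACEABLE_WMF_MAGIC = b"\xd7\xcd\xc6\x9a"  # placeable WMF key 0x9AC6CDD7, little-endian


def decode_bin_data(b64_text):
    """Base64-decode a binData string and gunzip it if it is gzip-compressed."""
    raw = base64.b64decode("".join(b64_text.split()))
    if raw.startswith(GZIP_MAGIC):
        raw = gzip.decompress(raw)
    return raw


def sniff_format(data):
    """Guess the image format from the first bytes of the decoded data."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(PLACEABLE_WMF_MAGIC):
        return "wmf"
    return "unknown"
```

Calling sniff_format(decode_bin_data(text)) on each extracted string is one way to tell whether the png or the metafile handling applies to it.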

> 
> For a document reporting application, I can successfully extract Word
> images and convert them to jpg's by using the extractor transformer
> and SVG serializer. I have not tried this with embedded equation images.
> 
> In the extractor pipeline I convert an extracted Word <pict> element
> to an SVG element by using the following XSL:
> 
> <xsl:template match="/">
>   <!-- width and height are read from the v:shape style attribute -->
>   <xsl:variable name="style" select="descendant::v:shape/@style"/>
>   <xsl:variable name="width"
>     select="substring-before(substring-after($style, 'width:'),';')"/>
>   <xsl:variable name="height" select="substring-after($style, 'height:')"/>
> 
>   <svg xmlns="http://www.w3.org/2000/svg">
>     <xsl:attribute name="width"><xsl:value-of select="$width"/></xsl:attribute>
>     <xsl:attribute name="height"><xsl:value-of select="$height"/></xsl:attribute>
>     <title>Embedded Word 2003 Image</title>
> 
>     <g>
>       <image xmlns:xlink="http://www.w3.org/1999/xlink">
>         <xsl:attribute name="width"><xsl:value-of select="$width"/></xsl:attribute>
>         <xsl:attribute name="height"><xsl:value-of select="$height"/></xsl:attribute>
>         <!-- the base64 string is passed through unchanged as a data: URI -->
>         <xsl:attribute name="xlink:href">data:image/png;base64,<xsl:value-of
>           select="descendant::w:binData/."/></xsl:attribute>
>       </image>
>     </g>
> 
>   </svg>
> </xsl:template>
> 
> The created SVG element contains the original Word base64 "w:binData"
> string in the xlink:href attribute of an <svg:image
> xlink:href="data:image/png;base64,binData STRING GOES HERE..."> tag.
> 
> -Hank
> 
> -----Original Message-----
> From: Stavros Kounis [mailto:skounis@gmail.com]
> Sent: Thursday, April 21, 2005 9:15 AM
> To: users@cocoon.apache.org
> Subject: ms word xml and embedded images
> 
> hi all
> 
> i want to publish xml files that are produced by microsoft word
> (save as xml)
> 
> the problem i have to solve is related to embedded equation images
> 
> when the document has an equation symbol, the produced xml has a base64
> string that contains a gzipped windows metafile. the way to get this
> windows metafile is:
> 
> 1. base64 decode the string
> 2. gunzip the result of (1)
> 
> do you know if anyone has already done some work along these lines?
> 
> if not, WDYT is the best approach?
> 
> i'm thinking about a custom generator that parses the .xml doc for
> embedded image strings and saves them to disk (here of course i don't
> know if i'm able to convert the windows metafile to .png or gif)
> 
> i would appreciate any hint
> 
> regards
> 
> stavros
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
> For additional commands, e-mail: users-help@cocoon.apache.org
> 
>


