cocoon-users mailing list archives

From Fred Vos <f...@fredvos.org>
Subject Dynamically generating arabic texts with svg2png serializer
Date Tue, 14 Feb 2006 22:51:35 GMT
Hello,

Last September I started an Arabic language course. Now I want to present
Arabic texts on a website, using Cocoon. Plain Arabic text is supported in
both Mozilla and Konqueror under Linux without a problem. Given entity-encoded
Unicode strings like &#x0628;&#x062d;&#x0631; , browsers present the Arabic
characters Beh, Hah and Reh from right to left. No problem.

But for beginners, Arabic script offers vowel marks like the Fatha, Damma or
Kasra, making it easier to understand how the text must be pronounced. You
can find these signs in the Unicode table as combining characters. To present
the above word as BaHRoenn (= sea), you can add the combining characters Fatha,
Sukun and Dammatan: &#x0628;&#x064e;&#x062d;&#x0652;&#x0631;&#x064c;
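
For anyone who wants to check these code points outside a browser, here is a
minimal sketch in Python (used purely for illustration, it is not part of the
Cocoon setup). It lists each character of the vocalised word and uses the
standard unicodedata module to tell base letters from combining marks. The
variable names are my own.

```python
import unicodedata

# The vocalised word (BaHRoenn), written as escapes so the source stays ASCII.
word = "\u0628\u064e\u062d\u0652\u0631\u064c"

# A non-zero canonical combining class means the character attaches to the
# preceding base letter instead of standing on its own.
for ch in word:
    kind = "combining mark" if unicodedata.combining(ch) else "base letter"
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch):<20} {kind}")

base_letters = [ch for ch in word if not unicodedata.combining(ch)]
marks = [ch for ch in word if unicodedata.combining(ch)]
```

The three base letters are the same Beh, Hah and Reh as in the unvocalised
string; only the three marks are new.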

Try this under Mozilla or Konqueror and a strange thing happens: the text is
presented from left to right and becomes unreadable, even for an Arab. I don't
know whether IE gets this right.

The only renderer that seems to work here is Batik. If I enter the above text
in an SVG file and convert it into a PNG file with the Batik rasterizer
(command line interface), it is presented correctly, from right to left and
with the combining characters.

Now my plan is as follows. I enter my texts, including the combining
characters, in an XML file. For display as ordinary browser text, I transform
these texts by removing the combining characters that the browsers cannot
handle. I use the following XSL/XPath construct to remove them:

<xsl:for-each select="str:tokenize(string(@ar),
'&#x064c;&#x064e;&#x064f;&#x0650;&#x0651;&#x0652;')">
  <xsl:value-of select="." />
</xsl:for-each>

(where @ar contains the string to convert)
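
To make the intended result of this step concrete, here is a rough Python
equivalent (illustration only; the stylesheet above is what actually runs in
Cocoon). It deletes the same six combining characters that serve as delimiters
in the str:tokenize call. The names VOWEL_MARKS and strip_vowel_marks are
made up for this sketch.

```python
# The same six combining characters as in the str:tokenize delimiter list.
VOWEL_MARKS = "\u064c\u064e\u064f\u0650\u0651\u0652"

# Mapping a code point to None makes str.translate delete that character.
_DELETE_MARKS = dict.fromkeys(map(ord, VOWEL_MARKS))


def strip_vowel_marks(text: str) -> str:
    """Return text with the listed combining marks removed."""
    return text.translate(_DELETE_MARKS)


vocalised = "\u0628\u064e\u062d\u0652\u0631\u064c"  # Beh+Fatha, Hah+Sukun, Reh+Dammatan
plain = strip_vowel_marks(vocalised)                # Beh, Hah, Reh only
```

Note that the list does not contain U+064B (Fathatan) or U+064D (Kasratan);
texts that use those marks would need them added to the list.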

This gives me Arabic text without the vowel marks. Any browser will present
this text nicely from right to left. To present the text with vowel marks, I
want to render it through a dynamically generated SVG file and the
svg2png serializer.

For western texts, things are easy. Using a basic SVG file for the generator,
I can transform this document with an XSL transformer, using the wildcard in
the matcher as a parameter to the transformer. The transformer adds the
parameter as text. This creates the SVG document including the text. Using the
svg2png serializer, I can get a PNG document containing my dynamic text.

Unfortunately this doesn't work for Arabic text, even without the combining
characters.

Here's the matcher in the sitemap:

      <map:match pattern="arab/artrans-*">
        <map:generate type="file" src="style/artrans.svg"/>
        <map:transform type="xslt" src="style/artranssvg.xsl">
          <map:parameter name="text" value="{1}"/>
        </map:transform>
        <map:serialize type="svg2png"/>
      </map:match>

If I request http://host:port/.../arab/artrans-<Arabic text for BaHRoenn
without vowel marks pasted here> in my browser (Mozilla), the URL is converted
into http://host:port/.../arab/artrans-%D8%A8%D8%AD%D8%B1 and the picture
contains rubbish text.
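
As a sanity check on the escaped URL itself, the following Python sketch
(illustration only, not part of the Cocoon setup) decodes the escaped part in
two ways. Decoded as UTF-8, the six bytes give exactly Beh, Hah and Reh, so
the browser seems to send the right data. If some step in the chain decoded
the same bytes as ISO-8859-1 instead, which is only an assumption and not
something I have verified, the result would be six unrelated Latin-1
characters, and that would look like rubbish in the picture.

```python
from urllib.parse import unquote

escaped = "%D8%A8%D8%AD%D8%B1"  # the part after "artrans-" in the URL

# Interpreting the percent-escaped bytes as UTF-8 yields three Arabic letters.
as_utf8 = unquote(escaped, encoding="utf-8")

# Interpreting the same bytes as ISO-8859-1 yields one character per byte.
as_latin1 = unquote(escaped, encoding="iso-8859-1")

print([f"U+{ord(c):04X}" for c in as_utf8])
print([f"U+{ord(c):04X}" for c in as_latin1])
```

Comparing the two lists against what ends up in the generated SVG might show
where the text goes wrong.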

Does anyone here have any idea how I can successfully use the Batik rasterizer
in the Cocoon environment for dynamically generating PNG or JPEG pictures with
Arabic texts?

Thanks in advance for your attention.

Fred

-- 
|E  R
| D  F
|
|fred at fredvos dot org
|5235 DG 52 NL +31 73 6411833
