pdfbox-users mailing list archives

From Hamed Iravanchi <iravan...@gmail.com>
Subject Re: Help needed to resolve issue with converting Arabic characters to presentation forms
Date Wed, 29 Feb 2012 08:49:14 GMT
Hi Andreas,

Regarding the glyph-drawing issue: since I hadn't heard back from you, I
decided to take a shot at it myself. I checked out the code (1.6 release tag)
and started modifying it to see whether I could get the result I expect, but
now I am confused and need help :)

I managed to convert the sample PDF that I provided to an image correctly,
but I broke the rendering of almost everything else! Here's what I did:

I added a "drawGlyph" method to PDFont, next to "drawString", like this:

    public abstract void drawString( String string, Graphics g, float fontSize,
        AffineTransform at, float x, float y ) throws IOException;

    public abstract void drawGlyph(int[] codeString, Graphics g, float fontSize,
                                   AffineTransform at, float x, float y)
        throws IOException;

I tried to use the codes extracted from the page stream. In PDFStreamEngine
-> processEncodedText -> the for loop, when "font.encode" succeeds, I kept the
same integer code for drawing glyphs. I passed it along with the string to
"processTextPosition", where I called "drawGlyph" instead of "drawString".

Here's the drawGlyph code that I wrote, according to your guidance:

    @Override
    public void drawGlyph(int[] codeString, Graphics g, float fontSize,
                          AffineTransform at, float x, float y)
            throws IOException
    {
        Font _awtFont = getawtFont();
        Graphics2D g2d = (Graphics2D)g;
        g2d.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                             RenderingHints.VALUE_ANTIALIAS_ON);
        writeFont(g2d, at, _awtFont, x, y, codeString);
    }


It uses an overload of writeFont similar to the original:


    protected void writeFont(final Graphics2D g2d, final AffineTransform at,
                             final Font awtFont, final float x, final float y,
                             final int[] codeString)
    {
        FontRenderContext frc = new FontRenderContext(null, true, true);

        // check if we have a rotation
        if (!at.isIdentity())
        {
            try
            {
                AffineTransform atInv = at.createInverse();
                // only apply the size of the transform; rotation will be
                // realized by rotating the graphics, otherwise the HP
                // printers will not render the font
                Font derivedFont = awtFont.deriveFont(1f);
                g2d.setFont(derivedFont);

                GlyphVector glyphs = derivedFont.createGlyphVector(frc, codeString);

                // apply the transformation to the graphics, which should be
                // the same as applying it to the text itself
                g2d.transform(at);
                // translate the coordinates
                Point2D.Float newXy = new Point2D.Float(x, y);
                atInv.transform(new Point2D.Float(x, y), newXy);
                g2d.drawGlyphVector(glyphs, (float)newXy.getX(), (float)newXy.getY());

                // restore the original transformation
                g2d.transform(atInv);
            }
            catch (NoninvertibleTransformException e)
            {
                log.error("Error in " + getClass().getName() + ".writeFont", e);
            }
        }
        else
        {
            Font derivedFont = awtFont.deriveFont(at);
            g2d.setFont(derivedFont);

            GlyphVector glyphs = derivedFont.createGlyphVector(frc, codeString);
            g2d.drawGlyphVector(glyphs, x, y);
        }
    }

Well, that made everything work for the sample PDF that I was working on.
But then I realized that it only works because the glyph codes in that font
happen to be equal to the codes used in the page stream.

For example, in a simple English PDF there is no "toUnicode" table, and the
page stream uses the ordinary character codes. But the glyph codes in the
font are different.

In another PDF (which is RTL and uses connected characters) the code
sequence in the page stream starts at 1 (like 1, 2, 3, 4, 5, 3, 6, ...).
There is no "toUnicode" table in it either, the glyph codes in the font are
different from those codes, and I couldn't find any relation between the two.
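
To make the mismatch concrete, here is a stand-alone sketch in plain Java. It
is not PDFBox code: the class name CodeVersusGlyph and both lookup tables are
invented for illustration, nothing here is read from a real font.

```java
import java.util.Arrays;

// Illustration only: the tables passed in are invented, not read from a font.
final class CodeVersusGlyph
{
    // glyphForCode[code] is the glyph ID the font would use for that code.
    static int[] toGlyphIds(int[] codes, int[] glyphForCode)
    {
        int[] glyphIds = new int[codes.length];
        for (int i = 0; i < codes.length; i++)
        {
            glyphIds[i] = glyphForCode[codes[i]];
        }
        return glyphIds;
    }

    // True when drawing the raw codes as glyph IDs would give the same result
    // as translating them first.
    static boolean codesAreGlyphIds(int[] codes, int[] glyphForCode)
    {
        return Arrays.equals(codes, toGlyphIds(codes, glyphForCode));
    }
}
```

With an identity table (code n maps to glyph n) the raw codes can be drawn
directly, which I believe is what happened with my sample PDF. With any other
table the same codes select the wrong glyphs.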

In the end, I don't know how to decide when to use glyphs and when to use
the extracted text (string) to draw the characters. Or is there a way to
convert everything to glyph codes and draw all the text using glyphs?
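
This is roughly the decision I have in mind, as a stand-alone sketch rather
than PDFBox code. It assumes a code-to-glyph table can be built for a font at
all, which is exactly the part I am unsure about. DrawModeChooser, its methods
and the numbers in the usage note are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not PDFBox code: picks glyph drawing only when every
// code of a run has a known glyph ID, otherwise falls back to the string.
final class DrawModeChooser
{
    private final Map<Integer, Integer> codeToGlyph = new HashMap<Integer, Integer>();

    void register(int code, int glyphId)
    {
        codeToGlyph.put(code, glyphId);
    }

    // Returns the glyph IDs for the run, or null if any code is unmapped.
    int[] glyphIdsOrNull(int[] codes)
    {
        int[] glyphIds = new int[codes.length];
        for (int i = 0; i < codes.length; i++)
        {
            Integer glyphId = codeToGlyph.get(codes[i]);
            if (glyphId == null)
            {
                return null;
            }
            glyphIds[i] = glyphId;
        }
        return glyphIds;
    }

    // Describes which call would be made; a real version would call
    // drawGlyph or drawString here instead of building a String.
    String choose(int[] codes, String text)
    {
        int[] glyphIds = glyphIdsOrNull(codes);
        return glyphIds != null
            ? "drawGlyph " + Arrays.toString(glyphIds)
            : "drawString " + text;
    }
}
```

After register(1, 57) and register(2, 12), the run {1, 2, 1} would go to
drawGlyph with {57, 12, 57}, while a run containing an unmapped code such as
{1, 9} would fall back to drawString.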

BTW, in your sample code for drawing glyphs (quoted below) there's a
"CIDstring" which I didn't understand, and I thought it might have something
to do with my current trouble.
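
The only part I can follow is the mechanics of the loop: it seems to copy one
int per char position of the String. Here is a stand-alone sketch of that
reading, which may be wrong (CodeStringUnpacker and the sample values are made
up by me):

```java
// Hypothetical stand-alone demo, not PDFBox code.
final class CodeStringUnpacker
{
    // Same loop shape as in the quoted snippet: one int per char index.
    static int[] unpack(String codes)
    {
        int length = codes.length();
        int[] result = new int[length];
        for (int i = 0; i < length; i++)
        {
            result[i] = codes.codePointAt(i);
        }
        return result;
    }
}
```

For a String built from the chars 1, 2, 3 this yields the array {1, 2, 3}.
What I still don't see is where such a string comes from, and whether its
values are meant to be glyph codes.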

Thanks in advance,
-Hamed


On Sat, Feb 18, 2012 at 10:58 PM, Andreas Lehmkuehler <andreas@lehmi.de> wrote:

> Hi,
>
> On 18.02.2012 18:52, Hamed Iravanchi wrote:
>
>> Hi again.
>>
>> Thanks for ur attention to the issue.
>> I actually checked, and saw that the font itself (ttf stream) contains the
>> correct cmap. If we can draw the text using glyph IDs instead of
>> characters, the font knows the right characters to draw.
>>
>> I checked the Font class instance in the debugger, it contains a cmap
>> which is exactly right. First I was looking for ways to take the mapping
>> from the font (since it is a private member, specific to the Sun impl).
>>
>> But I realized we could ask the font to draw glyphs instead of characters.
>> But I still couldn't find the right way to draw a glyph on graphics.
>>
> That's exactly what I'm doing. It looks roughly like the following:
>
> Create the needed glyphs:
>
> FontRenderContext frc = new FontRenderContext(null, true, true);
> int stringLength = CIDstring.length();
> int[] codePoints = new int[stringLength];
> for (int i=0;i<stringLength;i++)
>   codePoints[i] = CIDstring.codePointAt(i);
> GlyphVector glyphs = awtFont.createGlyphVector(frc, codePoints);
>
> ...
>
> Draw the glyphs:
>
> g2d.drawGlyphVector(glyphs, x, y);
>
>
>> BTW, I can also do the implementation and send u a patch once I realize
>> what to do. Thanks for ur encouragement :-)
>>
> Thanks for the offer, but I'm already on it. I just have to clean up the
> code and run some tests to avoid unwanted side effects.
> Once my code is available you might want to double-check it.
>
>
>> - Hamed
>> On Feb 18, 2012 7:05 PM, "Andreas Lehmkuehler" <andreas@lehmi.de> wrote:
>>
>> <SNIP>
>
> BR
> Andreas Lehmkühler
>
