pdfbox-users mailing list archives

From Andreas Lehmkuehler <andr...@lehmi.de>
Subject Re: Help needed to resolve issue with converting Arabic characters to presentation forms
Date Thu, 01 Mar 2012 19:12:21 GMT
Hi,

Am 29.02.2012 09:49, schrieb Hamed Iravanchi:
> Hi Andreas,
>
> Regarding the glyph-drawing issue, since I didn't hear anything from you I
> decided to take a shot myself, so I checked out the code (1.6 release tag)
> and started modifying it to see if I can get the result I expect, but I am
> confused and need help :)
Sorry, but I didn't have any free cycles last week.

> I managed to convert the sample PDF that I provided to image correctly, but
> I made almost everything else corrupt! Here's what I did:
>
> I added a "drawGlyph" to PDFont, next to "drawString" like this:
>
>      public abstract void drawString( String string, Graphics g, float fontSize,
>          AffineTransform at, float x, float y ) throws IOException;
>
>      public abstract void drawGlyph( int[] codeString, Graphics g, float fontSize,
>          AffineTransform at, float x, float y ) throws IOException;
>
> I tried to use the codes extracted from the page stream. In PDFStreamEngine
> ->  processEncodedText ->  for loop, when "font.encode" succeeds, I use the
> same code integer to draw glyphs: I pass it along with the string to
> "processTextPosition" and call "drawGlyph" there instead of
> "drawString".
>
> Here's the drawGlyph code that I wrote, according to your guidance:
>
>      @Override
>      public void drawGlyph(int[] codeString, Graphics g, float fontSize,
>              AffineTransform at, float x, float y) throws IOException
>      {
>          Font _awtFont = getawtFont();
>          Graphics2D g2d = (Graphics2D)g;
>          g2d.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
>                  RenderingHints.VALUE_ANTIALIAS_ON);
>          writeFont(g2d, at, _awtFont, x, y, codeString);
>      }
>
>
> Which uses an overload of writeFont similar to the original:
>
>
>      protected void writeFont(final Graphics2D g2d, final AffineTransform at,
>                               final Font awtFont, final float x, final float y,
>                               final int[] codeString)
>      {
>          FontRenderContext frc = new FontRenderContext(null, true, true);
>
>          // check if we have a rotation
>          if (!at.isIdentity())
>          {
>              try
>              {
>                  AffineTransform atInv = at.createInverse();
>                  // do only apply the size of the transform, rotation will be
>                  // realized by rotating the graphics, otherwise the hp printers
>                  // will not render the font
>                  Font derivedFont = awtFont.deriveFont(1f);
>                  g2d.setFont(derivedFont);
>
>                  GlyphVector glyphs = derivedFont.createGlyphVector(frc, codeString);
>
>                  // apply the transformation to the graphics, which should be
>                  // the same as applying the transformation itself to the text
>                  g2d.transform(at);
>                  // translate the coordinates
>                  Point2D.Float newXy = new Point2D.Float(x, y);
>                  atInv.transform(new Point2D.Float(x, y), newXy);
>                  g2d.drawGlyphVector(glyphs, (float)newXy.getX(), (float)newXy.getY());
>
>                  // restore the original transformation
>                  g2d.transform(atInv);
>              }
>              catch (NoninvertibleTransformException e)
>              {
>                  log.error("Error in " + getClass().getName() + ".writeFont", e);
>              }
>          }
>          else
>          {
>              Font derivedFont = awtFont.deriveFont(at);
>              g2d.setFont(derivedFont);
>
>              GlyphVector glyphs = derivedFont.createGlyphVector(frc, codeString);
>              g2d.drawGlyphVector(glyphs, x, y);
>          }
>      }
>
> Well, that made everything work for the sample PDF that I was working on.
> But then I realized that it is only because the "glyph" codes in the font
> are equal to the codes used in the page stream.
>
> For example, in a simple English PDF, there is no "toUnicode" table, and
> the same character codes are used in the page stream. But the glyph codes
> in the font are different.
>
> In another PDF (which is RTL and uses connected characters) the code
> sequence in the page stream starts from 1 (like 1, 2, 3, 4, 5, 3, 6, ...)
> but there is no "toUnicode" in it, and the glyph codes in the font are
> different from those codes, and I didn't find any relation between the two.
>
> After all this, I don't know how I can decide when to use glyphs and when to
> use the extracted text (string) to draw the characters. Or is there a way to
> convert everything to glyph codes and draw all the text using glyphs?
There are a lot of different ways to encode the text/glyph mapping, as you
already found out. ;-) I'm afraid it's too much to write down here.
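
Reduced to its core, the problem is that a code from the page stream is not
necessarily a glyph ID, so some table has to sit in between. The following is
only a hypothetical sketch of that idea: the class CodeToGlyphSketch and its
methods are invented for illustration, it is not how PDFBox does it, and how
the table gets filled (which depends on the font type and its encoding) is
deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class CodeToGlyphSketch {

    // Assumed table: page-stream code -> glyph ID in the embedded font.
    private final Map<Integer, Integer> codeToGlyph = new HashMap<>();

    void put(int code, int glyphId) {
        codeToGlyph.put(code, glyphId);
    }

    /**
     * Returns the glyph IDs for all codes, or null if any code has no
     * mapping. On null the caller would fall back to drawing the
     * decoded string instead of glyphs.
     */
    int[] resolve(int[] codes) {
        int[] glyphs = new int[codes.length];
        for (int i = 0; i < codes.length; i++) {
            Integer glyphId = codeToGlyph.get(codes[i]);
            if (glyphId == null) {
                return null;
            }
            glyphs[i] = glyphId;
        }
        return glyphs;
    }
}
```

With such a table, the choice between drawing glyphs and drawing the string
becomes a per-text-run decision: glyphs if every code resolves, the string
otherwise.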

I'm almost done with my implementation, but I still have to get rid of some
unwanted side effects. I hope to find some time at the weekend to finish it.

> BTW, in your sample code to draw glyphs (quoted below) there's a
> "CIDstring" which I didn't understand and I thought maybe it has something
> to do with my current trouble.
The CIDstring in my example contains the codes for the glyphs and not the 
readable text.
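
To make the difference concrete, here is a minimal sketch using only standard
java.awt calls. The class name GlyphCodeSketch and its helper methods are made
up for illustration and are not part of PDFBox. It assumes that every char of
the CID string carries exactly one glyph code, as in the quoted example.

```java
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.font.FontRenderContext;
import java.awt.font.GlyphVector;

// Hypothetical helper, not part of PDFBox.
final class GlyphCodeSketch {

    /** Unpacks a CID string: each char is taken as one glyph code (assumption). */
    static int[] toGlyphCodes(String cidString) {
        int[] codes = new int[cidString.length()];
        for (int i = 0; i < cidString.length(); i++) {
            codes[i] = cidString.charAt(i);
        }
        return codes;
    }

    /** Draws by glyph code: the values are handed to the font as glyph codes. */
    static void drawByGlyphCode(Graphics2D g2d, Font awtFont, String cidString,
                                float x, float y) {
        FontRenderContext frc = new FontRenderContext(null, true, true);
        GlyphVector glyphs = awtFont.createGlyphVector(frc, toGlyphCodes(cidString));
        g2d.drawGlyphVector(glyphs, x, y);
    }

    /** Draws readable text: the font maps each character to a glyph itself. */
    static void drawByText(Graphics2D g2d, Font awtFont, String text,
                           float x, float y) {
        FontRenderContext frc = new FontRenderContext(null, true, true);
        GlyphVector glyphs = awtFont.createGlyphVector(frc, text);
        g2d.drawGlyphVector(glyphs, x, y);
    }
}
```

The two draw methods differ only in the createGlyphVector overload they call:
int[] for glyph codes, String for characters.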

> Thanks in advance,
> -Hamed
>
>
> On Sat, Feb 18, 2012 at 10:58 PM, Andreas Lehmkuehler <andreas@lehmi.de> wrote:
>
>> Hi,
>>
>> Am 18.02.2012 18:52, schrieb Hamed Iravanchi:
>>
>>> Hi again.
>>>
>>> Thanks for your attention to the issue.
>>> I actually checked, and saw that the font itself (ttf stream) contains the
>>> correct cmap. If we can draw the text using glyph IDs instead of
>>> characters, the font knows the right characters to draw.
>>>
>>> I checked the Font class instance in the debugger; it contains a cmap
>>> which is exactly right. At first I was looking for ways to take the mapping
>>> from the font (since it is a private member, specific to the Sun impl).
>>>
>>> But I realized we could ask the font to draw glyphs instead of characters.
>>> But I still couldn't find the right way to draw a glyph on a graphics object.
>>>
>> That's exactly what I'm doing. It looks roughly like the following:
>>
>> Create the needed glyphs:
>>
>> FontRenderContext frc = new FontRenderContext(null, true, true);
>> int stringLength = CIDstring.length();
>> int[] codePoints = new int[stringLength];
>> for (int i=0;i<stringLength;i++)
>>    codePoints[i] = CIDstring.codePointAt(i);
>> GlyphVector glyphs = awtFont.createGlyphVector(frc, codePoints);
>>
>> ...
>>
>> Draw the glyphs:
>>
>> g2d.drawGlyphVector(glyphs, x, y);
>>
>>
>>> BTW, I can also do the implementation and send you a patch once I figure
>>> out what to do. Thanks for your encouragement :-)
>>>
>> Thanks for the offer, but I'm already on that. I just have to clean up the
>> code and run some tests to avoid unwanted side effects.
>> Once my code is available you might want to double-check it.
>>
>>
>>> - Hamed
>>> On Feb 18, 2012 7:05 PM, "Andreas Lehmkuehler" <andreas@lehmi.de> wrote:
>>>
>>>   <SNIP>
>>
>> BR
>> Andreas Lehmkühler

BR
Andreas Lehmkühler

