pdfbox-dev mailing list archives

From "Aaron Madlon-Kay (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-4304) Glyph Substitution Table lookup Cache doesn't clear by disabling a feature.
Date Wed, 05 Sep 2018 01:11:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603801#comment-16603801 ]

Aaron Madlon-Kay commented on PDFBOX-4304:

As I recall, when I wrote the GSUB code I put the cache in only partly for performance; there
was also a correctness component to it.

(This is based on the last time I looked at the code, so I apologize if it's not quite accurate.)

When choosing a substitution, there is some dependence on context:

# The incoming Unicode character's script influences the choice of LangSys, which determines
the available features
# Many Unicode characters have an ambiguous script ("Default"), which means we have to consider
the surrounding text (it would be better to be able to set a default script/language for the
whole document, but such a setting doesn't exist at the moment)
# The place where the script is determined ({{GlyphSubstitutionTable.selectScriptTag}}) can't
see the actual surrounding text, so it can only guess based on the last known-valid script
# This means that without a cache to ensure consistency, an ambiguous-script character may
be substituted differently throughout the document
# (I'm fuzzy on this point:) I thought that, because of the way the Unicode map is created
({{PDCIDFontType2Embedder.buildToUnicodeCMap}}), there was some need for a one-to-one correspondence
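
To illustrate the consistency point, here is a minimal, self-contained sketch. It is not the PDFBox implementation: {{ScriptGuessSketch}}, {{scriptOf}} and {{resolve}} are hypothetical names, and the script classifier is deliberately crude. It only shows how guessing from the last known-valid script gives different answers for the same ambiguous character unless the first answer is remembered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not PDFBox code: shows why a "last known script" guess
// needs a cache to stay consistent for ambiguous characters.
class ScriptGuessSketch {
    static final String DEFAULT = "DFLT";

    // Last script that was determined unambiguously (mutable state).
    private String lastKnownScript = DEFAULT;
    // Remembers the first script chosen for each code point.
    private final Map<Integer, String> cache = new HashMap<>();

    // Crude stand-in classifier (assumption): Arabic block -> "arab",
    // Latin letters -> "latn", everything else is treated as ambiguous.
    static String scriptOf(int codePoint) {
        if (codePoint >= 0x0600 && codePoint <= 0x06FF) {
            return "arab";
        }
        if (codePoint < 0x0250 && Character.isLetter(codePoint)) {
            return "latn";
        }
        return DEFAULT;
    }

    String resolve(int codePoint, boolean useCache) {
        if (useCache) {
            String cached = cache.get(codePoint);
            if (cached != null) {
                return cached;
            }
        }
        String script = scriptOf(codePoint);
        if (DEFAULT.equals(script)) {
            // Ambiguous: fall back to whatever was seen last.
            script = lastKnownScript;
        } else {
            lastKnownScript = script;
        }
        if (useCache) {
            cache.put(codePoint, script);
        }
        return script;
    }
}
```

Without the cache, resolving the digit '1' after a Latin letter and again after U+0628 yields "latn" and then "arab". With the cache, the second lookup returns the first answer.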

{quote}I think I've now understood the first part. You asking that the cache be reset when
features are disabled or enabled, that makes sense, as long as getUnsubstitution() isn't used
"too late".{quote}

I agree that resetting the cache makes sense, but I am also wary that the font needs to be
stateless per the issues [~jahewson] found with my initial implementation. Unfortunately I
haven't had time to see what changes were made to remove the statefulness.
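
One way to reconcile the two concerns might be to key the cache on every input that affects the result, so that enabling or disabling a feature simply produces a different key and nothing needs to be reset. The sketch below is an assumption for discussion, not the actual FontBox code: {{FeatureAwareCacheSketch}}, {{Key}} and the {{lookup}} callback are hypothetical, and the real GSUB parsing is replaced by a caller-supplied function.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.ToIntFunction;

// Hypothetical sketch, not FontBox code: a lookup cache keyed on
// (gid, scriptTags, enabledFeatures) instead of on gid alone.
class FeatureAwareCacheSketch {

    static final class Key {
        final int gid;
        final List<String> scriptTags;
        final List<String> enabledFeatures;

        Key(int gid, List<String> scriptTags, List<String> enabledFeatures) {
            this.gid = gid;
            // Defensive copies so later changes by the caller cannot alter the key.
            this.scriptTags = new ArrayList<>(scriptTags);
            this.enabledFeatures = new ArrayList<>(enabledFeatures);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            // List equality is order-sensitive; a real version might normalize order.
            return gid == other.gid
                    && scriptTags.equals(other.scriptTags)
                    && enabledFeatures.equals(other.enabledFeatures);
        }

        @Override
        public int hashCode() {
            return Objects.hash(gid, scriptTags, enabledFeatures);
        }
    }

    private final Map<Key, Integer> cache = new HashMap<>();
    // Counts cache misses, only to make the behavior observable.
    int computations = 0;

    int getSubstitution(int gid, List<String> scriptTags,
                        List<String> enabledFeatures, ToIntFunction<Key> lookup) {
        Key key = new Key(gid, scriptTags, enabledFeatures);
        Integer cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        computations++;
        int result = lookup.applyAsInt(key); // stand-in for the real GSUB lookup
        cache.put(key, result);
        return result;
    }
}
```

With such a key, a lookup under 'init' and a lookup under 'medi' for the same gid occupy separate cache entries. It does not address the case where several enabled features each offer a different substitution for one glyph.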

> Glyph Substitution Table lookup Cache doesn't clear by disabling a feature.
> ---------------------------------------------------------------------------
>                 Key: PDFBOX-4304
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4304
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.11
>            Reporter: Ali Safe
>            Priority: Major
>         Attachments: FDK_aban.ttf
> When I use GlyphSubstitutionTable to find the substituted gid for a specific
glyph that has 3 substitution forms, I get the same gid for all three forms.
> The font is a Persian font that has 3 substituted forms for some of its glyphs. I
enabled the 'init', 'medi' and 'fina' features one by one and then disabled them. But all of
these give me the same result.
> When I looked at the GlyphSubstitutionTable class and its getSubstitution(gid, scriptTags,
enabledFeatures) method, I saw a lookupCache that is first checked by gid only: if the gid is
present, the cached result is returned, and otherwise the parsing and calculations are performed.
I think this cache must be cleared every time a feature is enabled or disabled. The cache
should also be keyed on all three of the method's input arguments, because they all affect
the result of the calculations. At the very least, the lookupCache must be keyed on gid and
enabledFeatures.

> And when more than one feature is enabled, the lookup cache maps each gid to only one
substituted glyph, but in many languages there is more than one substitution form for some glyphs.
When I enable more than one feature, only the last enabled feature takes effect.
> I used this code and attached the mentioned font file...
>                     // Persian Beh Letter with code 1576 in the font
>                     // Enable init feature
>                     ttf.enableGsubFeature("init");
>                     CmapLookup cMapLookupInit = ttf.getUnicodeCmapLookup();
>                     int glyphIdInit = cMapLookupInit.getGlyphId(1576);
>                     ttf.disableGsubFeature("init");
>                     // Enable medi feature
>                     ttf.enableGsubFeature("medi");
>                     CmapLookup cMapLookupMedi = ttf.getUnicodeCmapLookup();
>                     int glyphIdMedi = cMapLookupMedi.getGlyphId(1576);
>                     ttf.disableGsubFeature("medi");
>                     // Now glyphIdMedi and glyphIdInit have the same value...

This message was sent by Atlassian JIRA

