pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: Problem reading PDF with Type1 font
Date Sat, 16 Jul 2016 02:47:49 GMT

> On 14 Jul 2016, at 00:10, Thomas Letsch <contact@thomas-letsch.de> wrote:
> 
> Hi John,
> 
> ok, got most of it. Saving the whole byte array doesn't seem to be easy in IntelliJ, but I
> could save the first 300 bytes, and apparently we got the error at pos=223.

Ok, it’s unconventional, but effective. Here’s what those bytes decode to:

%!FontType1-1.0: ARTWAB+Helvetica 1.0
10 dict begin
/FontName /ARTWAB+Helvetica def
/PaintType 0 def
/FontType 1 def
/FontMatrix [0.001 0 0 0.001 0 0] readonly def
/FontBBox [0 -221 932 886] readonly def
/Encoding [/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
/.notdef /.no
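
For anyone following along, here is a minimal Java sketch of how the decimal values listed by the debugger can be turned back into text. It assumes the values are plain ASCII codes; the class and method names are made up for illustration and are not part of PDFBox.

```java
import java.nio.charset.StandardCharsets;

public class DecodeBytes {
    // Converts decimal byte values (as listed by the debugger) into ASCII text.
    static String decode(int... values) {
        byte[] bytes = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            bytes[i] = (byte) values[i];
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Offsets 0..14 of the quoted buffer.
        System.out.println(decode(37, 33, 70, 111, 110, 116, 84, 121,
                                  112, 101, 49, 45, 49, 46, 48));
    }
}
```

Feeding in offsets 0 to 14 from the quoted list gives the start of the header line, "%!FontType1-1.0".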

Position 223 is the end of the first /.notdef, but the parser uses a one-token lookahead, so
the error actually relates to the preceding token, which is a left square bracket indicating
the start of an array. That matches the error:

> PDType1Font [ERROR] Can't read the embedded Type1 font <name removed>
> java.io.IOException: Found Token[kind=START_ARRAY, text=[] but expected INTEGER
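
To illustrate why the reported position runs one token ahead, here is a toy sketch of a lexer with one-token lookahead. It is not PDFBox's actual lexer; the class, its whitespace/bracket tokenizing rules, and the method names are all assumptions made for the example. The only facts taken from the buffer are the text of the /Encoding line and that it starts at offset 204.

```java
public class LookaheadDemo {
    /** Toy whitespace/bracket lexer that always buffers one token of lookahead. */
    static class Lexer {
        private final String src;
        private int pos = 0;   // offset just past the buffered lookahead token
        private String ahead;  // token that next() will return

        Lexer(String src) {
            this.src = src;
            this.ahead = read();
        }

        /** Returns the current token and buffers the one after it. */
        String next() {
            String current = ahead;
            ahead = read();
            return current;
        }

        int position() {
            return pos;
        }

        private String read() {
            while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
                pos++;
            }
            if (pos >= src.length()) {
                return null;
            }
            int start = pos;
            char c = src.charAt(pos);
            if (c == '[' || c == ']') {
                pos++;
            } else {
                while (pos < src.length() && !isDelimiter(src.charAt(pos))) {
                    pos++;
                }
            }
            return src.substring(start, pos);
        }

        private static boolean isDelimiter(char c) {
            return Character.isWhitespace(c) || c == '[' || c == ']';
        }
    }

    /** Lexer offset at the moment the parser receives the given token, or -1. */
    static int positionWhenCurrent(String src, String token) {
        Lexer lexer = new Lexer(src);
        String t;
        while ((t = lexer.next()) != null) {
            if (t.equals(token)) {
                return lexer.position();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "/Encoding" starts at offset 204 in the quoted buffer.
        int relative = positionWhenCurrent("/Encoding [/.notdef /.notdef", "[");
        System.out.println(204 + relative);
    }
}
```

When the parser is handed the "[" token, the lexer has already consumed the first /.notdef, so the offset it reports is 204 + 19 = 223, not the offset of the bracket itself.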


The encoding array in a Type 1 font usually follows a different format:

/Encoding 256 array
dup 32 /space put
...
dup 254 /bracerightbt put
readonly def

Quite different. Now, PostScript does allow arrays to be written more simply as [foo bar …],
with elements separated by whitespace, and that form is used in Type 1 fonts, just not
normally for the /Encoding array.
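
As a rough illustration of the difference, the sketch below looks at what follows /Encoding in the ASCII portion and reports which of the two forms it sees. The class name, enum, and regular expression are hypothetical and only meant for eyeballing a dump; this is not how PDFBox parses the font.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EncodingForm {
    enum Form { DUP_PUT, ARRAY_LITERAL, UNKNOWN }

    // Matches "/Encoding" followed by either "[" or "<integer> array".
    private static final Pattern AFTER_ENCODING =
            Pattern.compile("/Encoding\\s*(\\[|\\d+\\s+array\\b)");

    static Form detect(String fontAscii) {
        Matcher m = AFTER_ENCODING.matcher(fontAscii);
        if (!m.find()) {
            return Form.UNKNOWN; // anything else, or no /Encoding entry at all
        }
        return m.group(1).equals("[") ? Form.ARRAY_LITERAL : Form.DUP_PUT;
    }

    public static void main(String[] args) {
        System.out.println(detect("/Encoding 256 array\ndup 32 /space put\nreadonly def"));
        System.out.println(detect("/Encoding [/.notdef /.notdef"));
    }
}
```

The first call corresponds to the conventional "256 array ... dup ... put" layout, the second to the bracketed array found in this font.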

So is this legal or not? I’m not sure. I’d like to see what the entire encoding array
looks like first. Can you save the entire ASCII portion to a file? With Java 7 or later, just
paste this one-liner into IntelliJ’s watch window once you hit the breakpoint in parseASCII:


java.nio.file.Files.write(java.nio.file.Paths.get("font.txt"), bytes);
— John

> 
> Attached also screenshots from the debugger. If you need more, just tell me.
> 
> Thanks for your help,
> Thomas
> 
> 
> The first bytes in the buffer:
> 0 = 37
> 1 = 33
> 2 = 70
> 3 = 111
> 4 = 110
> 5 = 116
> 6 = 84
> 7 = 121
> 8 = 112
> 9 = 101
> 10 = 49
> 11 = 45
> 12 = 49
> 13 = 46
> 14 = 48
> 15 = 58
> 16 = 32
> 17 = 65
> 18 = 82
> 19 = 84
> 20 = 87
> 21 = 65
> 22 = 66
> 23 = 43
> 24 = 72
> 25 = 101
> 26 = 108
> 27 = 118
> 28 = 101
> 29 = 116
> 30 = 105
> 31 = 99
> 32 = 97
> 33 = 32
> 34 = 49
> 35 = 46
> 36 = 48
> 37 = 10
> 38 = 49
> 39 = 48
> 40 = 32
> 41 = 100
> 42 = 105
> 43 = 99
> 44 = 116
> 45 = 32
> 46 = 98
> 47 = 101
> 48 = 103
> 49 = 105
> 50 = 110
> 51 = 10
> 52 = 47
> 53 = 70
> 54 = 111
> 55 = 110
> 56 = 116
> 57 = 78
> 58 = 97
> 59 = 109
> 60 = 101
> 61 = 32
> 62 = 47
> 63 = 65
> 64 = 82
> 65 = 84
> 66 = 87
> 67 = 65
> 68 = 66
> 69 = 43
> 70 = 72
> 71 = 101
> 72 = 108
> 73 = 118
> 74 = 101
> 75 = 116
> 76 = 105
> 77 = 99
> 78 = 97
> 79 = 32
> 80 = 100
> 81 = 101
> 82 = 102
> 83 = 10
> 84 = 47
> 85 = 80
> 86 = 97
> 87 = 105
> 88 = 110
> 89 = 116
> 90 = 84
> 91 = 121
> 92 = 112
> 93 = 101
> 94 = 32
> 95 = 48
> 96 = 32
> 97 = 100
> 98 = 101
> 99 = 102
> 100 = 10
> 101 = 47
> 102 = 70
> 103 = 111
> 104 = 110
> 105 = 116
> 106 = 84
> 107 = 121
> 108 = 112
> 109 = 101
> 110 = 32
> 111 = 49
> 112 = 32
> 113 = 100
> 114 = 101
> 115 = 102
> 116 = 10
> 117 = 47
> 118 = 70
> 119 = 111
> 120 = 110
> 121 = 116
> 122 = 77
> 123 = 97
> 124 = 116
> 125 = 114
> 126 = 105
> 127 = 120
> 128 = 32
> 129 = 91
> 130 = 48
> 131 = 46
> 132 = 48
> 133 = 48
> 134 = 49
> 135 = 32
> 136 = 48
> 137 = 32
> 138 = 48
> 139 = 32
> 140 = 48
> 141 = 46
> 142 = 48
> 143 = 48
> 144 = 49
> 145 = 32
> 146 = 48
> 147 = 32
> 148 = 48
> 149 = 93
> 150 = 32
> 151 = 114
> 152 = 101
> 153 = 97
> 154 = 100
> 155 = 111
> 156 = 110
> 157 = 108
> 158 = 121
> 159 = 32
> 160 = 100
> 161 = 101
> 162 = 102
> 163 = 10
> 164 = 47
> 165 = 70
> 166 = 111
> 167 = 110
> 168 = 116
> 169 = 66
> 170 = 66
> 171 = 111
> 172 = 120
> 173 = 32
> 174 = 91
> 175 = 48
> 176 = 32
> 177 = 45
> 178 = 50
> 179 = 50
> 180 = 49
> 181 = 32
> 182 = 57
> 183 = 51
> 184 = 50
> 185 = 32
> 186 = 56
> 187 = 56
> 188 = 54
> 189 = 93
> 190 = 32
> 191 = 114
> 192 = 101
> 193 = 97
> 194 = 100
> 195 = 111
> 196 = 110
> 197 = 108
> 198 = 121
> 199 = 32
> 200 = 100
> 201 = 101
> 202 = 102
> 203 = 10
> 204 = 47
> 205 = 69
> 206 = 110
> 207 = 99
> 208 = 111
> 209 = 100
> 210 = 105
> 211 = 110
> 212 = 103
> 213 = 32
> 214 = 91
> 215 = 47
> 216 = 46
> 217 = 110
> 218 = 111
> 219 = 116
> 220 = 100
> 221 = 101
> 222 = 102
> 223 = 32
> 224 = 47
> 225 = 46
> 226 = 110
> 227 = 111
> 228 = 116
> 229 = 100
> 230 = 101
> 231 = 102
> 232 = 32
> 233 = 47
> 234 = 46
> 235 = 110
> 236 = 111
> 237 = 116
> 238 = 100
> 239 = 101
> 240 = 102
> 241 = 32
> 242 = 47
> 243 = 46
> 244 = 110
> 245 = 111
> 246 = 116
> 247 = 100
> 248 = 101
> 249 = 102
> 250 = 32
> 251 = 47
> 252 = 46
> 253 = 110
> 254 = 111
> 255 = 116
> 256 = 100
> 257 = 101
> 258 = 102
> 259 = 32
> 260 = 47
> 261 = 46
> 262 = 110
> 263 = 111
> 264 = 116
> 265 = 100
> 266 = 101
> 267 = 102
> 268 = 32
> 269 = 47
> 270 = 46
> 271 = 110
> 272 = 111
> 273 = 116
> 274 = 100
> 275 = 101
> 276 = 102
> 277 = 32
> 278 = 47
> 279 = 46
> 280 = 110
> 281 = 111
> 282 = 116
> 283 = 100
> 284 = 101
> 285 = 102
> 286 = 10
> 287 = 47
> 288 = 46
> 289 = 110
> 290 = 111
> 291 = 116
> 292 = 100
> 293 = 101
> 294 = 102
> 295 = 32
> 296 = 47
> 297 = 46
> 298 = 110
> 299 = 111
> 
> 
> 
> 
> Am 14.07.2016 um 01:17 schrieb John Hewson:
>>> On 13 Jul 2016, at 02:16, Thomas Letsch <contact@thomas-letsch.de> wrote:
>>> 
>>> Hi John,
>>> 
>>> I am really sorry, but I spoke with my colleagues and we may not give out
>>> even the fonts.
>>> 
>>> I could try and debug it on my machine if there are some hints about
>>> what to check.
>> Launch with a debugger attached and when the exception is thrown, find
>> the "parseASCII" method in the call stack and save the "bytes" array to a .txt
>> file. This contains only the ASCII portion of the font, which you can easily read
>> by eye and see that it contains no sensitive information. Once you’ve confirmed
>> that, send us the .txt file.
>> 
>> — John
>> 
>>> Thanks for your help,
>>> Thomas
>>> 
>>> Am 13.07.2016 um 04:44 schrieb John Hewson:
>>>> You are right, probably me being too strict. It’s called ARTWAB+Helvetica.
>>>> That’s a subset. So we’re going to need that actual font file. You can extract
>>>> it from the PDF using our GUI-based PDFDebugger. Navigate to the page in
>>>> question, and find the Font resource with that name. Right-click on the
>>>> FontFile resource in the tree and save the stream to a .pfb file. Then send us
>>>> that file.
>>>> 
>>>> — John
>>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>> 
>> 
> 
> 

