pdfbox-users mailing list archives

From Scott Tomer <sc...@tomer.cc>
Subject International characters only show correctly when form field is selected
Date Thu, 08 Mar 2018 19:42:06 GMT

I’m new to the list. I searched the archive at http://pdfbox-users.markmail.org/ before
asking, with no luck.

We are using PDFBox to fill in some form fields in an Adobe-generated template, but we get
odd results when certain international characters are used (some, not all).  When the PDF
is first opened, the characters shown are basically garbage.  Here is an example: þÿB D

However, when you click into the field (or, in certain readers like Okular on Linux, choose
“Show Forms”), the correct characters are shown.  Here is what is inserted into the field
and shown when the field is selected: ł ń Ł ó ź
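
For what it is worth, the two strings above are consistent with one possible reading: PDF
text strings that need Unicode are stored as UTF-16BE with a leading FE FF byte-order mark,
and those bytes look like "þÿ" when shown one byte per character. The sketch below uses
only the JDK (no PDFBox) to reproduce that effect. The class and method names are made up
for illustration, and dropping control characters is an assumption about how a viewer
draws unprintable bytes, not a confirmed diagnosis of this PDF.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not part of PDFBox).
final class BomGarbageDemo {

    // Encode text the way a PDF text string is stored when it needs Unicode:
    // the FE FF byte-order mark followed by UTF-16BE code units.
    static byte[] toPdfUnicodeBytes(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_16BE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFE;
        out[1] = (byte) 0xFF;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    // Decode the same bytes one byte per character (ISO-8859-1) and drop
    // control characters. Assumption: a viewer would not draw those bytes.
    static String misreadAsSingleByte(byte[] bytes) {
        String latin1 = new String(bytes, StandardCharsets.ISO_8859_1);
        StringBuilder visible = new StringBuilder();
        for (char c : latin1.toCharArray()) {
            if (!Character.isISOControl(c)) {
                visible.append(c);
            }
        }
        return visible.toString();
    }

    public static void main(String[] args) {
        // U+0142 and U+0144 are the first two Polish letters from the example,
        // written as escapes so the source file stays ASCII-only.
        String inserted = "\u0142 \u0144";
        System.out.println(misreadAsSingleByte(toPdfUnicodeBytes(inserted)));
    }
}
```

Fed the first two inserted letters (ł, a space, ń), this yields þÿB D, the same garbage
shown on first open. If this reading is right, the read-only appearance is drawing the raw
UTF-16 bytes with a single-byte font encoding, while the editing view decodes the field
value properly.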

It is almost as if the PDF has one font selected for the read-only view and a different,
correct font for the view when editing a field.

This is happening with Polish, Russian, Chinese and other languages.

This is how I am populating the fields:

PDDocument pdfDoc = LoadPDF.load(cs, document);
PDDocumentCatalog docCatalog = pdfDoc.getDocumentCatalog();
PDAcroForm acroForm = docCatalog.getAcroForm();

if (acroForm != null) {
	for (PDField field : acroForm.getFieldTree()) {
		for (PdfField pdfField : pdfFields) {
			if (field.getPartialName() != null && field.getPartialName().equalsIgnoreCase(pdfField.getName()))

Thanks for any help,