pdfbox-users mailing list archives

From "Lukas Baab" <19lu...@web.de>
Subject use embedded fonts to write text
Date Thu, 07 Mar 2013 09:23:10 GMT

Hi!

I want to read the text of a PDF and write it again onto the same page.

In theory this is simple: use the PDFStreamEngine to collect all TextPositions of a page. A
TextPosition has everything needed to write the text with the same font at the right
place. The code is below; the complete example is attached.

Unfortunately it is not that easy: Whether this solution works or not depends on the font
of the text and how the text is embedded into the pdf.

Questions:
Which font types and kinds of font embedding does PDFBox support? (Which of them can be
reused to write new text into the PDF?)
Do I have to handle different embedded fonts differently? If so, how?
How can I check whether I can write a given text with a font?

I appreciate every kind of advice and answer!

Thanks
Lukas



Attachment:
Code of TextReprintExample
exampleFiles:
  example 1: created with LibreOffice; the whole text is reprinted with wrong characters
  example 2: created with Word; the text is reprinted correctly, but the special characters
  „ and “ are not reprinted
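
One way to narrow down example 2 is to check whether the missing characters fall outside a single-byte Latin encoding. The sketch below uses only the JDK; the class and method names are hypothetical, and it is an assumption (not something confirmed in this message) that the dropped characters are related to such an encoding limit.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class SingleByteCheck {

    /** Returns true if every character of the text can be encoded as ISO-8859-1. */
    public static boolean fitsLatin1(String text) {
        // A fresh encoder per call; canEncode must not be used mid-encoding.
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1("Hello"));              // plain ASCII
        System.out.println(fitsLatin1("\u201EHallo\u201C"));  // „Hallo“ with typographic quotes
    }
}
```

The typographic quotes „ (U+201E) and “ (U+201C) have no ISO-8859-1 code, so the second call returns false while the first returns true. This says nothing about which glyphs the embedded font actually contains; it only separates "plain Latin" text from text that needs more than one byte per character.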



Here is the code:

public void reprintTextTest() throws Exception {
  PDDocument document = PDDocument.load("E:/80_tmp/test.pdf");
  try {
    List<PDPage> allPages = document.getDocumentCatalog().getAllPages();

    for (PDPage page : allPages) {
      // getTextPosition(page) is part of the attached example and not shown here
      List<TextPosition> textPositionsOfPage = getTextPosition(page);
      writeText(document, page, textPositionsOfPage);
    }

    document.save("E:/80_tmp/test-result.pdf");
  } finally {
    document.close(); // release the file even if reprinting fails
  }
}

private void writeText(PDDocument document, PDPage page, List<TextPosition> textPositions)
    throws IOException {
  float pageHeight = page.findMediaBox().getHeight();
  // true, true: append to the existing page content and compress the new stream
  PDPageContentStream pageContentStream = new PDPageContentStream(document, page, true, true);
  try {
    // green makes the reprinted text easy to tell apart from the original
    pageContentStream.setNonStrokingColor(Color.GREEN);

    for (TextPosition textPosition : textPositions) {
      float x = textPosition.getX();
      // TextPosition measures y from the top of the page, the content stream from the bottom
      float y = pageHeight - textPosition.getY();
      pageContentStream.beginText();
      pageContentStream.moveTextPositionByAmount(x, y);
      // reuse the font object that the original text was drawn with
      pageContentStream.setFont(textPosition.getFont(), textPosition.getFontSize());
      pageContentStream.drawString(textPosition.getCharacter());
      pageContentStream.endText();
    }
  } finally {
    pageContentStream.close();
  }
}
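
The y calculation in writeText can be looked at in isolation. The following is a minimal sketch with a hypothetical helper name; it assumes an unrotated page whose media box starts at (0, 0), which the code above also assumes implicitly.

```java
public class YFlip {

    /**
     * Converts a y value measured downward from the top edge of the page
     * into a y value measured upward from the bottom edge.
     */
    public static float toUserSpaceY(float pageHeight, float yFromTop) {
        return pageHeight - yFromTop;
    }

    public static void main(String[] args) {
        // A page 842 units high, text 100 units below the top edge
        System.out.println(toUserSpaceY(842f, 100f)); // prints 742.0
    }
}
```

For rotated pages, or pages whose media box does not start at the origin, this simple subtraction would probably not be enough.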