[ https://issues.apache.org/jira/browse/PDFBOX-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr closed PDFBOX-4275.
-----------------------------------
Resolution: Incomplete
Closing due to lack of feedback. You can still comment and/or reopen. PDFBox can extract diagonal
text. However, the output doesn't look good, for obvious reasons. There is no solution because each
glyph stands for itself, so only a human would know that it is part of something larger.
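
To illustrate the point about each glyph standing for itself, here is a minimal sketch in plain
Java (no PDFBox classes). It is not how PDFBox works internally. It assumes each glyph arrives
with the a, b, e, f entries of its text matrix, for example values read from the Matrix passed
to an overridden showGlyph(); that mapping is an assumption. Glyph, angleDegrees and
readAlongBaseline are hypothetical names. The sketch keeps the glyphs whose rotation is close to
a target angle and orders them by their projection onto that baseline direction.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SlantedText {

    /** Hypothetical holder for one glyph: its text plus four entries of its text matrix. */
    static final class Glyph {
        final String text;
        final double a, b; // first matrix row: scale * cos(angle), scale * sin(angle)
        final double x, y; // translation part, i.e. the glyph origin

        Glyph(String text, double a, double b, double x, double y) {
            this.text = text;
            this.a = a;
            this.b = b;
            this.x = x;
            this.y = y;
        }
    }

    /** Rotation of a glyph in degrees, in the range (-180, 180]. */
    static double angleDegrees(double a, double b) {
        return Math.toDegrees(Math.atan2(b, a));
    }

    /** Joins the glyphs rotated by roughly targetDeg, ordered along that direction. */
    static String readAlongBaseline(List<Glyph> glyphs, double targetDeg, double toleranceDeg) {
        final double rad = Math.toRadians(targetDeg);
        final double ux = Math.cos(rad); // unit vector along the baseline
        final double uy = Math.sin(rad);

        List<Glyph> line = new ArrayList<>();
        for (Glyph g : glyphs) {
            double diff = Math.abs(angleDegrees(g.a, g.b) - targetDeg);
            diff = Math.min(diff, 360.0 - diff); // handle wrap-around at +/-180
            if (diff <= toleranceDeg) {
                line.add(g);
            }
        }
        // Order by the projection of each origin onto the baseline direction.
        line.sort(Comparator.comparingDouble((Glyph g) -> g.x * ux + g.y * uy));

        StringBuilder sb = new StringBuilder();
        for (Glyph g : line) {
            sb.append(g.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double c = Math.cos(Math.PI / 4);
        double s = Math.sin(Math.PI / 4);
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("C", c, s, 20, 20)); // 45 degrees, deliberately out of order
        glyphs.add(new Glyph("X", 1, 0, 5, 5));   // horizontal glyph, filtered out
        glyphs.add(new Glyph("A", c, s, 0, 0));
        glyphs.add(new Glyph("B", c, s, 10, 10));
        System.out.println(readAlongBaseline(glyphs, 45.0, 1.0)); // prints "ABC"
    }
}
```

The sketch treats all glyphs at the same angle as a single line. Separating several parallel
slanted lines, or deciding where words begin and end, would need further heuristics, which is
the part that only a human reliably gets right.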
> Can't extract slanted text through the parsers of the PDFBox
> ------------------------------------------------------------
>
> Key: PDFBOX-4275
> URL: https://issues.apache.org/jira/browse/PDFBOX-4275
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, Text extraction
> Affects Versions: 2.0.10
> Environment: I tested that in the overridden showGlyph() method of my class extending
> PDFStreamEngine, PDFGraphicsStreamEngine or PDFTextStripper.
> Reporter: Soocheon Kim
> Priority: Major
>
> PDFBox (StreamEngine) extracts only text that is rotated by 0, 90, 180 or -90 degrees.
> For example, it can't extract text rotated by 45 or 60 degrees.