pdfbox-dev mailing list archives

From Andreas Lehmkühler (JIRA) <j...@apache.org>
Subject [jira] Commented: (PDFBOX-400) TextExtractor do not extract complete text
Date Sat, 10 Jan 2009 09:59:59 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662646#action_12662646 ]

Andreas Lehmkühler commented on PDFBOX-400:
-------------------------------------------

Have you tried the upcoming version 0.8?
Do you get an error message, or is only the mentioned part of the text missing?
Please attach a sample document to this issue if possible.

> TextExtractor do not extract complete text
> ------------------------------------------
>
>                 Key: PDFBOX-400
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-400
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 0.7.3
>         Environment: Win Xp Professional SP2
>            Reporter: prashant dhaka
>            Priority: Critical
>
> Hi , 
> Need your help and advice.. 
>  
> when we extract text from pdf acrylic text is not extracted.. 
> Example: 
>  
> TEXT In PDF 
> "Th" of "The" is acrylic 
>  
> The remaining text  
>  
> Extracted Text: 
> e remaining text  
>  
> Th is missing 
>  
> Note: Acrobat extract the complete text. 
>  
> Plz provide your suggestion to resolve this issue. 
>  
> Thnx 
> Prashant 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

