pdfbox-users mailing list archives

From CDB <cbu...@burkeitconsulting.com>
Subject Re: COSString Selection
Date Mon, 11 Jun 2012 21:46:41 GMT
Is this mailing list active?



On 6/8/12 1:05 PM, "CDB" <cburke@burkeitconsulting.com> wrote:

>I am having issues with the getText method (ExtractText), as it
>concatenates all of the text together.
>I would like to go a step deeper, pull out each COSString value, and
>delimit them.
>Below is the code I am using so far to get all of the text.
>
>I am not 
>        try {
>            PDFTextStripper pdfTextStripper = new PDFTextStripper();
>            doc = PDDocument.load( stream );
>            return pdfTextStripper.getText(doc);
>        } finally {
>            quietlyClose(doc);
>        }
>
>
>I noticed that the logs show the operators and their types, but some
>strings are broken up into multiple COSString fields within arrays.
>
>I would like to know which methods I can use to traverse all of the
>fields and select the COSStrings.
>
>Thanks
>
>
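
The traversal asked about above can be sketched without any library. This is
only an illustration of the idea, not PDFBox code: PdfString and Operator are
hypothetical stand-ins for what PDFBox models as COSString and a content-stream
operator, and a plain java.util.List stands in for COSArray. In PDFBox itself
the token list would presumably come from the content-stream parser
(PDFStreamParser), whose constructor and methods differ between versions, so
that part is left out here. The walk visits tokens in order, descends into
arrays, and keeps only the strings, which are then joined with a delimiter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StringTokenWalk {

    // Hypothetical stand-in for a PDF string operand (COSString in PDFBox).
    static final class PdfString {
        final String value;
        PdfString(String value) { this.value = value; }
    }

    // Hypothetical stand-in for a content-stream operator such as Tj or TJ.
    static final class Operator {
        final String name;
        Operator(String name) { this.name = name; }
    }

    // Visit tokens in order, descend into arrays (modelled as List),
    // and collect every string operand. Numbers and operators are skipped.
    static void collectStrings(List<?> tokens, List<String> out) {
        for (Object token : tokens) {
            if (token instanceof PdfString) {
                out.add(((PdfString) token).value);
            } else if (token instanceof List) {
                collectStrings((List<?>) token, out);
            }
        }
    }

    // Collect all string operands and join them with the given delimiter.
    static String joinStrings(List<?> tokens, String delimiter) {
        List<String> found = new ArrayList<>();
        collectStrings(tokens, found);
        return String.join(delimiter, found);
    }

    public static void main(String[] args) {
        // Models the content stream fragment: [(Hel) -20 (lo)] TJ (World) Tj
        List<Object> tokens = Arrays.<Object>asList(
            Arrays.<Object>asList(new PdfString("Hel"), -20, new PdfString("lo")),
            new Operator("TJ"),
            new PdfString("World"),
            new Operator("Tj"));
        System.out.println(joinStrings(tokens, "|")); // prints Hel|lo|World
    }
}
```

Note that the string operands in a real content stream are character codes in
the current font's encoding, so the raw values are not guaranteed to be
readable text without mapping them through the font.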


