pdfbox-users mailing list archives

From Colette Joubarne <cjouba...@privacyanalytics.ca>
Subject RE: Unable to mark document as tagged
Date Fri, 13 Jun 2014 15:48:24 GMT
Maruan,

I am copying the entire structure from a tagged document and just replacing some of the text,
so I would think that the structure is unchanged. Then again, who knows what I might have messed
up.

James.pdf is the original file:
https://dl.dropboxusercontent.com/u/7689859/James.pdf

James-mod.pdf is the modified file:
https://dl.dropboxusercontent.com/u/7689859/James-mod.pdf

Colette

-----Original Message-----
From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
Sent: June-13-14 10:45 AM
To: users@pdfbox.apache.org
Subject: Re: Unable to mark document as tagged

Hi Colette,

This information alone doesn't make a document a tagged PDF! Your PDF might not contain the
structure information it needs. Do you have a pair of sample files, one that works and one that
doesn't, which you could upload to a public location? Attachments are not allowed on the mailing list.

BR
Maruan
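
A quick way to see the point above is to compare the two catalog dictionaries quoted further down in this thread: the original carries a /StructTreeRoot entry, while the newly created one lists only /MarkInfo. The following is a minimal Java sketch, not part of the original exchange, that scans the quoted dictionary text for both keys. The names CatalogKeys, hasKey and hasTaggingEntries are hypothetical, and the check is a plain-text scan for illustration only, not a PDF parser and not a PDFBox API.

```java
import java.util.regex.Pattern;

// Diagnostic sketch only: a plain-text scan of a catalog dictionary, not a PDF parser.
final class CatalogKeys {

    // True if the dictionary text contains the name /key as a whole token,
    // i.e. not merely as a prefix of a longer alphanumeric name such as /MarkInfoX.
    static boolean hasKey(String dict, String key) {
        return Pattern.compile("/" + Pattern.quote(key) + "(?![A-Za-z0-9])")
                      .matcher(dict)
                      .find();
    }

    // A catalog can only describe a tagged PDF if it carries both entries.
    static boolean hasTaggingEntries(String dict) {
        return hasKey(dict, "MarkInfo") && hasKey(dict, "StructTreeRoot");
    }

    public static void main(String[] args) {
        // Catalog of the original document, as quoted in the thread (truncated there).
        String original =
            "/Type/Catalog/Pages 2 0 R/Lang(en-CA) /StructTreeRoot 10 0 R/MarkInfo<</Marked true";
        // Catalog of the newly created document, as quoted in the thread.
        String created = "/Type /Catalog /Version /1.4 /Pages 2 0 R /MarkInfo 3 0 R";

        System.out.println("original: " + hasTaggingEntries(original));
        System.out.println("created:  " + hasTaggingEntries(created));
    }
}
```

Run on the two strings from the thread, this prints true for the original and false for the created document, which suggests that the missing /StructTreeRoot entry is the first thing worth checking.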

On 13.06.2014 at 15:44, Colette Joubarne <cjoubarne@privacyanalytics.ca> wrote:

> Maruan,
> 
> Yes, you are right. However, why is it that when I look at the properties in Adobe Reader, it indicates that the document is not tagged?
> 
> 3 0 obj
> <<
> /Marked true
> >>
> 
> Colette
> -----Original Message-----
> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
> Sent: June-13-14 9:19 AM
> To: users@pdfbox.apache.org
> Subject: Re: Unable to mark document as tagged
> 
> Dear Colette,
> 
> /MarkInfo 3 0 R indicates that the information you are looking for is referenced and should be available in 3 0 obj. Could you verify that?
> 
> With kind regards
> 
> Maruan
> 
> On 13.06.2014 at 14:21, Colette Joubarne <cjoubarne@privacyanalytics.ca> wrote:
> 
>> I have a tagged pdf doc with the following header:
>> 
>>           /Type/Catalog/Pages 2 0 R/Lang(en-CA) /StructTreeRoot 10 0 R/MarkInfo<</Marked true
>> 
>> I read in the contents, replace some of the text and create a new doc. I copy the document information from the original doc and set marked to true.
>> 
>>           newDoc = new PDDocument();
>>           newDoc.setDocumentInformation(PTConstants.pdfDoc.getDocumentInformation());
>> 
>>           PDMarkInfo markinfo = new PDMarkInfo();
>>           markinfo.setMarked(true);
>>           newDoc.getDocumentCatalog().setMarkInfo(markinfo);
>> 
>> and when I check that it was set, it returns true:
>> 
>>     PDMarkInfo markInfo = PTConstants.pdfDoc.getDocumentCatalog().getMarkInfo();
>>     if ((markInfo != null) && (markInfo.isMarked())) System.out.println("true");
>> 
>> But, while the resulting document displays correctly, the header indicates that it is not tagged:
>> 
>> /Type /Catalog
>> /Version /1.4
>> /Pages 2 0 R
>> /MarkInfo 3 0 R
>> 
>> Any idea what is going on?
>> 
>> Colette
> 

