From: lehmi@apache.org
To: commits@pdfbox.apache.org
Reply-To: dev@pdfbox.apache.org
Subject: pdfbox-docs git commit: Site checkin for project Apache PDFBox Website
Date: Sun, 26 Nov 2017 11:01:44 +0000 (UTC)
Mailing-List: contact commits-help@pdfbox.apache.org; run by ezmlm
X-Mailer: ASF-Git Admin Mailer
archived-at: Sun, 26 Nov 2017 11:01:48 -0000

Repository: pdfbox-docs
Updated Branches:
  refs/heads/asf-site 4eacbca65 -> 63cbc6cc6

Site checkin for project Apache PDFBox Website

Project: http://git-wip-us.apache.org/repos/asf/pdfbox-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/pdfbox-docs/commit/63cbc6cc
Tree: http://git-wip-us.apache.org/repos/asf/pdfbox-docs/tree/63cbc6cc
Diff: http://git-wip-us.apache.org/repos/asf/pdfbox-docs/diff/63cbc6cc

Branch: refs/heads/asf-site
Commit: 63cbc6cc68ca17134e8c1031482160e537fcf16e
Parents: 4eacbca
Author: Andreas Lehmkühler
Authored: Sun Nov 26 12:01:42 2017 +0100
Committer: Andreas Lehmkühler
Committed: Sun Nov 26 12:01:42 2017 +0100

----------------------------------------------------------------------
 content/1.8/architecture.html                   | 15 +++---
 content/1.8/cookbook/documentcreation.html      | 10 ++--
 content/1.8/cookbook/encryption.html            |  5 +-
 content/1.8/cookbook/fill-form-field.html       | 25 ++++++----
 content/1.8/cookbook/pdfacreation.html          | 20 ++++----
 content/1.8/cookbook/pdfavalidation.html        |  5 +-
 content/1.8/cookbook/rendering.html             |  5 +-
 content/1.8/cookbook/textextraction.html        | 10 ++--
 .../1.8/cookbook/workingwithattachments.html    |  5 +-
 content/1.8/cookbook/workingwithfonts.html      | 15 +++---
 content/1.8/cookbook/workingwithmetadata.html   | 15 +++---
 content/1.8/dependencies.html                   | 15 +++---
 content/1.8/faq.html                            | 20 ++++----
 content/2.0/cookbook/encryption.html            |  5 +-
 content/2.0/dependencies.html                   | 15 +++---
 content/2.0/faq.html                            | 20 ++++----
 content/2.0/getting-started.html                |  5 +-
 content/2.0/migration.html                      | 50 ++++++++++++--------
 content/building.html                           | 15 +++---
 content/codingconventions.html                  |  5 +-
 content/doap_PDFBox.rdf                         | 14 ++++++
 content/siteupdate.html                         | 40 +++++++++-------
 22 files changed, 206 insertions(+), 128 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/architecture.html
----------------------------------------------------------------------
diff --git a/content/1.8/architecture.html b/content/1.8/architecture.html
index 363b5c2..aee21ee 100644
--- a/content/1.8/architecture.html
+++ b/content/1.8/architecture.html
@@ -272,19 +272,21 @@ doesn’t provide the functionality needed.

A page in a PDF document is represented with a COSDictionary. The entries that are available for a page can be seen in the PDF Reference and an example of a page looks like this:

-
<<
+
<<
     /Type /Page
     /MediaBox [0 0 612 915]
     /Contents 56 0 R
 >>
-
+ +

The information within the dictionary can be accessed using the COS model

-
COSDictionary page = ...;
+
COSDictionary page = ...;
 COSArray mediaBox = (COSArray)page.getDictionaryObject( "MediaBox" );
 System.out.println( "Width:" + mediaBox.get( 2 ) );
-
+ +

As can be seen from that little example the COS model provides a low level API to access information within the PDF. In order to use the COS model successfully a good knowledge of
@@ -302,10 +304,11 @@ available to access the attributes.

The same code from above to get the page width can be rewritten to use PD Model classes.

-
PDPage page = ...;
+
PDPage page = ...;
 PDRectangle mediaBox = page.getMediaBox();
 System.out.println( "Width:" + mediaBox.getWidth() );
-
+ +
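
For reference, a PDF rectangle such as /MediaBox lists its corners as [llx lly urx ury], so the page width is the third entry minus the first. The following is a minimal plain-Java sketch of that arithmetic, not PDFBox code; the class and method names are hypothetical and only illustrate what a width getter has to compute.

```java
final class MediaBoxSketch {
    /** Width of a PDF rectangle given as [llx, lly, urx, ury]. */
    static float width(float[] rect) {
        return rect[2] - rect[0];
    }

    /** Height of a PDF rectangle given as [llx, lly, urx, ury]. */
    static float height(float[] rect) {
        return rect[3] - rect[1];
    }

    public static void main(String[] args) {
        // Values taken from the /MediaBox entry of the sample page dictionary.
        float[] mediaBox = {0f, 0f, 612f, 915f};
        System.out.println("Width:" + width(mediaBox));
        System.out.println("Height:" + height(mediaBox));
    }
}
```

Subtracting the lower-left coordinate matters for pages whose media box does not start at the origin.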

PD Model objects sit on top of COS model. Typically, the classes in the PD Model will only store a COS object and all setter/getter methods will modify data that is stored in the
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/documentcreation.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/documentcreation.html b/content/1.8/cookbook/documentcreation.html
index 17041e0..1a5e398 100644
--- a/content/1.8/cookbook/documentcreation.html
+++ b/content/1.8/cookbook/documentcreation.html
@@ -162,7 +162,7 @@

This small sample shows how to create a new PDF document using PDFBox.

-
// Create a new empty document
+
// Create a new empty document
 PDDocument document = new PDDocument();
 
 // Create a new blank page and add it to the document
@@ -175,13 +175,14 @@
 // finally make sure that the document is properly
 // closed.
 document.close();
-
+ +

Hello World Using a PDF Base Font

This small sample shows how to create a new document and print the text “Hello World” using one of the PDF base fonts.

-
// Create a document and add a page to it
+
// Create a document and add a page to it
 PDDocument document = new PDDocument();
 PDPage page = new PDPage();
 document.addPage( page );
@@ -205,7 +206,8 @@
 // Save the results and ensure that the document is properly closed:
 document.save( "Hello World.pdf");
 document.close();
-
+ +
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/encryption.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/encryption.html b/content/1.8/cookbook/encryption.html
index 43ae771..f17541f 100644
--- a/content/1.8/cookbook/encryption.html
+++ b/content/1.8/cookbook/encryption.html
@@ -164,7 +164,7 @@

This small sample shows how to encrypt a file so that it can be viewed, but not printed.

-
PDDocument doc = PDDocument.load("filename.pdf");
+
PDDocument doc = PDDocument.load("filename.pdf");
 
 // Define the length of the encryption key.
 // Possible values are 40 or 128 (256 will be available in PDFBox 2.0).
@@ -184,7 +184,8 @@
 
 doc.save("filename-encrypted.pdf");
 doc.close();
-
+ +
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/fill-form-field.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/fill-form-field.html b/content/1.8/cookbook/fill-form-field.html
index fff33dc..1937e1b 100644
--- a/content/1.8/cookbook/fill-form-field.html
+++ b/content/1.8/cookbook/fill-form-field.html
@@ -164,48 +164,53 @@ be necessary to walk through the tree to get an individual field.

Load the PDF document.

-
:::java
+
:::java
 // load the document
 PDDocument pdfDocument = PDDocument.loadNonSeq(new File(... ), null);
-
+ +

Get the document catalog and the AcroForm which might be contained within.

-
:::java
+
:::java
 // get the document catalog
 PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
 PDAcroForm acroForm = docCatalog.getAcroForm();
-
+ +

Retrieve an individual field and set its value.

-
:::java
+
:::java
 // as there might not be an AcroForm entry a null check is necessary
 if (acroForm != null)
 {
     PDField field = (PDField) acroForm.getField( "fieldName" );
     field.setValue("new field value");
 }
-
+ +

If a field is nested within the form tree a fully qualified name might be provided to access the field.

-
:::java
+
:::java
 // as there might not be an AcroForm entry a null check is necessary
 if (acroForm != null)
 {
     PDField field = (PDField) acroForm.getField( "fieldsParentName.fieldName" );
     field.setValue("new field value");
 }
-
+ +
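
To make the idea of a fully qualified name concrete, here is a small plain-Java sketch that resolves a dotted name against a nested map. It is only an illustration under stated assumptions: the nested `Map` stands in for the form tree, and `FieldTreeSketch` and `lookup` are hypothetical names, not PDFBox API.

```java
import java.util.HashMap;
import java.util.Map;

final class FieldTreeSketch {
    /**
     * Walks a nested map one name segment at a time.
     * Returns null when any segment of the dotted name is missing.
     */
    @SuppressWarnings("unchecked")
    static Object lookup(Map<String, Object> root, String fullyQualifiedName) {
        Object current = root;
        for (String part : fullyQualifiedName.split("\\.")) {
            if (!(current instanceof Map)) {
                return null; // reached a leaf before the name was consumed
            }
            current = ((Map<String, Object>) current).get(part);
            if (current == null) {
                return null; // no child with this partial name
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> parent = new HashMap<>();
        parent.put("fieldName", "field value");
        Map<String, Object> root = new HashMap<>();
        root.put("fieldsParentName", parent);

        System.out.println(lookup(root, "fieldsParentName.fieldName"));
        System.out.println(lookup(root, "fieldName"));
    }
}
```

The second lookup fails because the partial name alone does not identify a nested field from the root of the tree.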

Save and close the filled out form.

-
:::java
+
:::java
 pdfDocument.save(filledForm);
 pdfDocument.close();
-
+ +
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/pdfacreation.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/pdfacreation.html b/content/1.8/cookbook/pdfacreation.html
index 63650e0..d914393 100644
--- a/content/1.8/cookbook/pdfacreation.html
+++ b/content/1.8/cookbook/pdfacreation.html
@@ -169,9 +170,10 @@ document. The current example creates a valid PDF/A-1b document.

The PDF/A specification enforces that the fonts used in the document are present in the PDF File. You have to load them. As an example:

-
InputStream fontStream = CreatePDFA.class.getResourceAsStream("/org/apache/pdfbox/resources/ttf/ArialMT.ttf");
+
InputStream fontStream = CreatePDFA.class.getResourceAsStream("/org/apache/pdfbox/resources/ttf/ArialMT.ttf");
 PDFont font = PDTrueTypeFont.loadTTF(doc, fontStream);
-
+ +

Include XMP Metadata Block

@@ -179,21 +180,22 @@ have to load them. As an example:

of PDF/A specification reached by the document) must be present. These lines create the XMP metadata for a PDF/A-1b document:

-
XMPMetadata xmp = new XMPMetadata();
+
XMPMetadata xmp = new XMPMetadata();
 XMPSchemaPDFAId pdfaid = new XMPSchemaPDFAId(xmp);
 xmp.addSchema(pdfaid);
 pdfaid.setConformance("B");
 pdfaid.setPart(1);
 pdfaid.setAbout("");
 metadata.importXMPMetadata(xmp);
-
+ +

Include Color Profile

It is mandatory to include the color profile used by the document. Different profiles can be used. This example takes one present in pdfbox:

-
// Create output intent
+
// Create output intent
 InputStream colorProfile = CreatePDFA.class.getResourceAsStream("/org/apache/pdfbox/resources/pdfa/sRGB Color Space Profile.icm");
 PDOutputIntent oi = new PDOutputIntent(doc, colorProfile); 
 oi.setInfo("sRGB IEC61966-2.1"); 
@@ -201,14 +203,16 @@ example takes one present in pdfbox:

 oi.setOutputConditionIdentifier("sRGB IEC61966-2.1");
 oi.setRegistryName("http://www.color.org");
 cat.addOutputIntent(oi);
-
+ +

Complete example

The complete example can be found in pdfbox-example. The source file is

-
src/main/java/org/apache/pdfbox/examples/pdfa/CreatePDFA.java
-
+
src/main/java/org/apache/pdfbox/examples/pdfa/CreatePDFA.java
+
+
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/pdfavalidation.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/pdfavalidation.html b/content/1.8/cookbook/pdfavalidation.html
index 0a7a762..7bdb3d7 100644
--- a/content/1.8/cookbook/pdfavalidation.html
+++ b/content/1.8/cookbook/pdfavalidation.html
@@ -163,7 +163,7 @@ Check Compliance with PDF/A-1b

This small sample shows how to check the compliance of a file with the PDF/A-1b specification.

-
ValidationResult result = null;
+
ValidationResult result = null;
 
 PreflightParser parser = new PreflightParser(args[0]);
 try
@@ -210,7 +210,8 @@ Check Compliance with PDF/A-1b

         System.out.println(error.getErrorCode() + " : " + error.getDetails());
     }
 }
-
+ +

Categories of Validation Error

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/rendering.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/rendering.html b/content/1.8/cookbook/rendering.html
index bf8b039..710a27b 100644
--- a/content/1.8/cookbook/rendering.html
+++ b/content/1.8/cookbook/rendering.html
@@ -162,7 +162,7 @@

This small sample shows how to render (convert to images) a PDF document using PDFBox.

-
:::java
+
:::java
     String filename = "YOURFILENAMEHERE.pdf";
 
     // open the document
@@ -196,7 +196,8 @@
     }
 
     doc.close();
-
+ +
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/textextraction.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/textextraction.html b/content/1.8/cookbook/textextraction.html
index 0fe7a85..772c5ab 100644
--- a/content/1.8/cookbook/textextraction.html
+++ b/content/1.8/cookbook/textextraction.html
@@ -175,8 +175,9 @@ org.apache.pdfbox.ExtractText.

Lucene to be able to index a PDF document it must first be converted to text. PDFBox provides a simple approach for adding PDF documents into a Lucene index.

-
Document luceneDocument = LucenePDFDocument.getDocument( ... );
-
+
Document luceneDocument = LucenePDFDocument.getDocument( ... );
+
+

Now that you have a Lucene Document object, you can add it to the Lucene index just like you would if it had been created from a text or HTML file. The LucenePDFDocument automatically
@@ -199,11 +200,12 @@ process. The simplest is to specify the range of pages that you want to be extra
For example, to only extract text from the second and third pages of the PDF document you could do this:

-
PDFTextStripper stripper = new PDFTextStripper();
+
PDFTextStripper stripper = new PDFTextStripper();
 stripper.setStartPage( 2 );
 stripper.setEndPage( 3 );
 stripper.writeText( ... );
-
+ +

NOTE: The startPage and endPage properties of PDFTextStripper are 1 based and inclusive.
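
As an illustration of what "1 based and inclusive" means in practice, the plain-Java sketch below maps such a range onto a zero-based list. It does not use PDFBox; `PageRangeSketch` and `select` are hypothetical names, and the bounds check is an assumption of this sketch rather than a description of how PDFTextStripper treats out-of-range values.

```java
import java.util.Arrays;
import java.util.List;

final class PageRangeSketch {
    /**
     * Returns the pages from startPage to endPage, both 1-based and inclusive.
     * subList takes a zero-based start and an exclusive end, hence the offsets.
     */
    static <T> List<T> select(List<T> pages, int startPage, int endPage) {
        if (startPage < 1 || endPage < startPage || endPage > pages.size()) {
            throw new IllegalArgumentException(
                "Invalid page range " + startPage + ".." + endPage);
        }
        return pages.subList(startPage - 1, endPage);
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("page 1", "page 2", "page 3", "page 4");
        // Start page 2 and end page 3 select the second and third pages.
        System.out.println(select(pages, 2, 3));
    }
}
```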

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/workingwithattachments.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/workingwithattachments.html b/content/1.8/cookbook/workingwithattachments.html
index 003f69b..790298d 100644
--- a/content/1.8/cookbook/workingwithattachments.html
+++ b/content/1.8/cookbook/workingwithattachments.html
@@ -182,7 +182,7 @@ attribute of the PDComplexFileSpecification
-
PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
+
PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
 
 //first create the file specification, which holds the embedded file
 PDComplexFileSpecification fs = new PDComplexFileSpecification();
@@ -203,7 +203,8 @@ Attachments are part of the named tree that is attached to the document catalog.
 PDDocumentNameDictionary names = new PDDocumentNameDictionary( doc.getDocumentCatalog() );
 names.setEmbeddedFiles( efTree );
 doc.getDocumentCatalog().setNames( names );
-
+ +
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/workingwithfonts.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/workingwithfonts.html b/content/1.8/cookbook/workingwithfonts.html
index ad99e5f..9c6cf78 100644
--- a/content/1.8/cookbook/workingwithfonts.html
+++ b/content/1.8/cookbook/workingwithfonts.html
@@ -233,7 +233,7 @@

This small sample shows how to create a new document and print the text “Hello World” using one of the PDF base fonts.

-
// Create a document and add a page to it
+
// Create a document and add a page to it
 PDDocument document = new PDDocument();
 PDPage page = new PDPage();
 document.addPage( page );
@@ -257,13 +257,14 @@
 // Save the results and ensure that the document is properly closed:
 document.save( "Hello World.pdf");
 document.close();
-
+ +

Hello World Using a TrueType Font

This small sample shows how to create a new document and print the text “Hello World” using a TrueType font.

-
// Create a document and add a page to it
+
// Create a document and add a page to it
 PDDocument document = new PDDocument();
 PDPage page = new PDPage();
 document.addPage( page );
@@ -287,7 +288,8 @@
 // Save the results and ensure that the document is properly closed:
 document.save( "Hello World.pdf");
 document.close();
-
+ +

While it is recommended to embed all fonts for greatest portability not all PDF producer applications will do this. When displaying a PDF it is necessary to find an external font to use.
@@ -301,7 +303,7 @@ use when no mapping exists.

This small sample shows how to create a new document and print the text “Hello World” using a PostScript Type1 font.

-
// Create a document and add a page to it
+
// Create a document and add a page to it
 PDDocument document = new PDDocument();
 PDPage page = new PDPage();
 document.addPage( page );
@@ -325,7 +327,8 @@ use when no mapping exists.

 // Save the results and ensure that the document is properly closed:
 document.save( "Hello World.pdf");
 document.close();
-
+ +
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/cookbook/workingwithmetadata.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/workingwithmetadata.html b/content/1.8/cookbook/workingwithmetadata.html
index e8bbf36..0338026 100644
--- a/content/1.8/cookbook/workingwithmetadata.html
+++ b/content/1.8/cookbook/workingwithmetadata.html
@@ -170,7 +170,7 @@ Getting basic Metadata

To set or retrieve basic information about the document the PDDocumentInformation object provides a high level API to that information:

-
PDDocumentInformation info = document.getDocumentInformation();
+
PDDocumentInformation info = document.getDocumentInformation();
 System.out.println( "Page Count=" + document.getNumberOfPages() );
 System.out.println( "Title=" + info.getTitle() );
 System.out.println( "Author=" + info.getAuthor() );
@@ -181,7 +181,8 @@ provides a high level API to that information:

 System.out.println( "Creation Date=" + info.getCreationDate() );
 System.out.println( "Modification Date=" + info.getModificationDate());
 System.out.println( "Trapped=" + info.getTrapped() );
-
+ +

Accessing PDF Metadata

@@ -192,19 +193,20 @@ See Adobe Documentation: XMP Specification

PDF documents can have XML metadata associated with certain objects within a PDF document. For example, the following PD Model objects have the ability to contain metadata:

-
PDDocumentCatalog
+
PDDocumentCatalog
 PDPage
 PDXObject
 PDICCBased
 PDStream
-
+ +

The metadata that is stored in PDF objects conforms to the XMP specification; it is recommended that you review that specification. Currently there is no high-level API for managing the XML metadata; PDFBox uses the standard Java InputStream/OutputStream to retrieve or set the XML metadata.

-
PDDocument doc = PDDocument.load( ... );
+
PDDocument doc = PDDocument.load( ... );
 PDDocumentCatalog catalog = doc.getDocumentCatalog();
 PDMetadata metadata = catalog.getMetadata();
 
@@ -215,7 +217,8 @@ or set the XML metadata.

 InputStream newXMPData = ...;
 PDMetadata newMetadata = new PDMetadata(doc, newXMPData, false );
 catalog.setMetadata( newMetadata );
-
+ +
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/dependencies.html
----------------------------------------------------------------------
diff --git a/content/1.8/dependencies.html b/content/1.8/dependencies.html
index be9d801..0d5fe67 100644
--- a/content/1.8/dependencies.html
+++ b/content/1.8/dependencies.html
@@ -189,12 +189,13 @@ included in the Java platform.

To add the pdfbox, fontbox, jempbox and commons-logging jars to your application, the easiest thing is to declare the Maven dependency shown below. This gives you the main pdfbox library directly and the other required jars as transitive dependencies.

-
<dependency>
+
<dependency>
   <groupId>org.apache.pdfbox</groupId>
   <artifactId>pdfbox</artifactId>
   <version>...</version>
 </dependency>
-
+ +

Set the version field to the latest stable PDFBox version.

@@ -217,7 +218,7 @@ pdfbox library directly and the other required jars as transitive dependencies.<

The most notable such optional feature is support for PDF encryption. Instead of implementing its own encryption algorithms, PDFBox uses libraries from the Legion of the Bouncy Castle. Both the bcprov and bcmail libraries are needed and can be included using the Maven dependencies shown below.

-
<dependency>
+
<dependency>
   <groupId>org.bouncycastle</groupId>
   <artifactId>bcprov-jdk15</artifactId>
   <version>1.44</version>
@@ -227,19 +228,21 @@ pdfbox library directly and the other required jars as transitive dependencies.<
   <artifactId>bcmail-jdk15</artifactId>
   <version>1.44</version>
 </dependency>
-
+ +

Support for Bidirectional Languages

Another important optional feature is support for bidirectional languages like Arabic. PDFBox uses the ICU4J library from the International Components for Unicode (ICU) project to support such languages in PDF documents. To add the ICU4J jar to your project, use the following Maven dependency.

-
<dependency>
+
<dependency>
   <groupId>com.ibm.icu</groupId>
   <artifactId>icu4j</artifactId>
   <version>3.8</version>
 </dependency>
-
+ +

PDFBox also contains extra support for use with the Lucene and Ant projects. Since in these cases PDFBox is just an add-on feature to these projects, you should first set up your application to use Lucene or Ant and then add PDFBox support as described on this page.

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/1.8/faq.html
----------------------------------------------------------------------
diff --git a/content/1.8/faq.html b/content/1.8/faq.html
index 5c893b0..af40b5a 100644
--- a/content/1.8/faq.html
+++ b/content/1.8/faq.html
@@ -181,22 +181,25 @@

I am getting the below Log4J warning message, how do I remove it?

-
log4j:WARN No appenders could be found for logger (org.apache.pdfbox.util.ResourceLoader).
+
log4j:WARN No appenders could be found for logger (org.apache.pdfbox.util.ResourceLoader).
 log4j:WARN Please initialize the log4j system properly.
-
+ +

This message means that you need to configure the log4j logging system. See the log4j documentation for more information.

PDFBox comes with a sample log4j configuration file. To use it you set a system property like this

-
java -Dlog4j.configuration=log4j.xml org.apache.pdfbox.ExtractText <PDF-file> <output-text-file>
-
+
java -Dlog4j.configuration=log4j.xml org.apache.pdfbox.ExtractText <PDF-file> <output-text-file>
+
+

If this is not working for you then you may have to specify the log4j config file using a URL path, like this:

-
log4j.configuration=file:///<path to config file>
-
+
log4j.configuration=file:///<path to config file>
+
+
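
If you prefer to derive the file URL programmatically, the following plain-Java sketch converts a path to a `file:` URL and sets the property from code. It is a sketch under assumptions: the class name `Log4jConfigLocation` is hypothetical, the default path `log4j.xml` is only a placeholder, and the property generally has to be set before the logging system initializes for it to have any effect.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class Log4jConfigLocation {
    /** Converts a (possibly relative) path into an absolute file: URL string. */
    static String toConfigUrl(String configPath) {
        Path absolute = Paths.get(configPath).toAbsolutePath().normalize();
        return absolute.toUri().toString();
    }

    public static void main(String[] args) {
        // Placeholder path; pass the real location as the first argument.
        String url = toConfigUrl(args.length > 0 ? args[0] : "log4j.xml");
        System.setProperty("log4j.configuration", url);
        System.out.println("log4j.configuration=" + url);
    }
}
```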

Is PDFBox thread safe?

@@ -212,7 +215,7 @@ don’t then the document will not be closed properly. Also, you must close all PDDocument objects that get created. The following code creates two PDDocument objects; one from the “new PDDocument()” and the second by the load method.

-
PDDocument doc = new PDDocument();
+
PDDocument doc = new PDDocument();
 try
 {
    doc = PDDocument.load( "my.pdf" );
@@ -224,7 +227,8 @@ PDDocument objects; one from the “new PDDocument()” and the second by the lo
       doc.close();
    }
 }
-
+ +

Text Extraction

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/2.0/cookbook/encryption.html
----------------------------------------------------------------------
diff --git a/content/2.0/cookbook/encryption.html b/content/2.0/cookbook/encryption.html
index 9386839..e93bdb9 100644
--- a/content/2.0/cookbook/encryption.html
+++ b/content/2.0/cookbook/encryption.html
@@ -164,7 +164,7 @@

This small sample shows how to encrypt a file so that it can be viewed, but not printed.

-
PDDocument doc = PDDocument.load(new File("filename.pdf"));
+
PDDocument doc = PDDocument.load(new File("filename.pdf"));
 
 // Define the length of the encryption key.
 // Possible values are 40, 128 or 256.
@@ -184,7 +184,8 @@
 
 doc.save("filename-encrypted.pdf");
 doc.close();
-
+ +
http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/2.0/dependencies.html
----------------------------------------------------------------------
diff --git a/content/2.0/dependencies.html b/content/2.0/dependencies.html
index 6d5bcb1..4534ecb 100644
--- a/content/2.0/dependencies.html
+++ b/content/2.0/dependencies.html
@@ -188,12 +188,13 @@ included in the Java platform.

Include Dependencies Using Maven

To add the pdfbox, fontbox, xmpbox and commons-logging jars to your application, the easiest thing is to declare the Maven dependency shown below. This gives you the main pdfbox library directly and the other required jars as transitive dependencies.

-
<dependency>
+
<dependency>
     <groupId>org.apache.pdfbox</groupId>
     <artifactId>pdfbox</artifactId>
     <version>...</version>
 </dependency>
-
+ +

Set the version field to the latest stable PDFBox version.

@@ -223,7 +224,7 @@ There is a known issue when using the JBIG2-Image-Decoder as an ImageIO Plugin. T

Encrypting and signing PDFs requires the bcprov, bcmail and bcpkix libraries from the Legion of the Bouncy Castle. These can be included in your Maven project using the following dependencies:

-
<dependency>
+
<dependency>
     <groupId>org.bouncycastle</groupId>
     <artifactId>bcprov-jdk15on</artifactId>
     <version>1.54</version>
@@ -240,14 +241,16 @@ There is a known issue when using the JBIG2-Image-Decoder as an ImageIO Plugin. T
     <artifactId>bcpkix-jdk15on</artifactId>
     <version>1.54</version>
 </dependency>
-
+ +

Java Cryptography Extension (JCE)

256-bit AES encryption requires a JDK with “unlimited strength” cryptography, which requires extra files to be installed. For JDK 7, see Java Cryptography Extension (JCE). If these files are not installed, building PDFBox will throw an exception with the following message:

-
JCE unlimited strength jurisdiction policy files are not installed
-
+
JCE unlimited strength jurisdiction policy files are not installed
+
+
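
To check ahead of time whether the running JDK permits 256-bit AES, you can query the standard `javax.crypto.Cipher.getMaxAllowedKeyLength` method. The sketch below is a standalone check, not part of PDFBox; the class name `JcePolicyCheck` and its helper methods are hypothetical.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

final class JcePolicyCheck {
    /** Maximum AES key length in bits allowed by the installed policy files. */
    static int maxAesKeyLength() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            // AES is a required algorithm on compliant JDKs, so this is unexpected.
            throw new IllegalStateException("AES is not available", e);
        }
    }

    /** True when keys of at least 256 bits are permitted. */
    static boolean supportsAes256() {
        return maxAesKeyLength() >= 256;
    }

    public static void main(String[] args) {
        System.out.println("Max AES key length: " + maxAesKeyLength());
        System.out.println(supportsAes256()
            ? "256-bit AES is permitted"
            : "256-bit AES is not permitted; the unlimited strength policy files may be missing");
    }
}
```

With unlimited strength policy in effect, the method reports `Integer.MAX_VALUE`, which is why the check compares with `>=` rather than `==`.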

Dependencies for Ant Builds

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/2.0/faq.html
----------------------------------------------------------------------
diff --git a/content/2.0/faq.html b/content/2.0/faq.html
index 22d32fb..7f17a08 100644
--- a/content/2.0/faq.html
+++ b/content/2.0/faq.html
@@ -203,22 +203,25 @@

I am getting the below Log4J warning message, how do I remove it?

-
log4j:WARN No appenders could be found for logger (org.apache.pdfbox.util.ResourceLoader).
+
log4j:WARN No appenders could be found for logger (org.apache.pdfbox.util.ResourceLoader).
 log4j:WARN Please initialize the log4j system properly.
-
+ +

This message means that you need to configure the log4j logging system. See the log4j documentation for more information.

PDFBox comes with a sample log4j configuration file. To use it you set a system property like this

-
java -Dlog4j.configuration=log4j.xml org.apache.pdfbox.ExtractText <PDF-file> <output-text-file>
-
+
java -Dlog4j.configuration=log4j.xml org.apache.pdfbox.ExtractText <PDF-file> <output-text-file>
+
+

If this is not working for you then you may have to specify the log4j config file using a URL path, like this:

-
log4j.configuration=file:///<path to config file>
-
+
log4j.configuration=file:///<path to config file>
+
+

@@ -238,7 +241,7 @@ don’t then the document will not be closed properly. Also, you must close all PDDocument objects that get created. The following code creates two PDDocument objects; one from the “new PDDocument()” and the second by the load method.

-
PDDocument doc = new PDDocument();
+
PDDocument doc = new PDDocument();
 try
 {
    doc = PDDocument.load( new File( "my.pdf" ) );
@@ -250,7 +253,8 @@ PDDocument objects; one from the “new PDDocument()” and the second by the lo
       doc.close();
    }
 }
-
+ +

Font Handling

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/2.0/getting-started.html
----------------------------------------------------------------------
diff --git a/content/2.0/getting-started.html b/content/2.0/getting-started.html
index 42228b2..6decdce 100644
--- a/content/2.0/getting-started.html
+++ b/content/2.0/getting-started.html
@@ -162,12 +162,13 @@

To use the latest release you’ll need to add the following dependency:

-
<dependency>
+
<dependency>
   <groupId>org.apache.pdfbox</groupId>
   <artifactId>pdfbox</artifactId>
   <version>2.0.4</version>
 </dependency>
-
+ +

PDFBox and Java 8

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/2.0/migration.html
----------------------------------------------------------------------
diff --git a/content/2.0/migration.html b/content/2.0/migration.html
index a0975d8..e368454 100644
--- a/content/2.0/migration.html
+++ b/content/2.0/migration.html
@@ -222,8 +222,9 @@ results when switching to PDFBox 2.0.0.

TrueType fonts shall now be loaded using

-
PDType0Font.load
-
+
PDType0Font.load
+
+

to leverage that.

@@ -253,34 +254,37 @@ and so on. The add method now supports al

Prior to PDFBox 2.0 parsing the page content was done using

-
PDStream contents = page.getContents();
+
PDStream contents = page.getContents();
 PDFStreamParser parser = new PDFStreamParser(contents.getStream());
 parser.parse();
 List<Object> tokens = parser.getTokens();
-
+ +

With PDFBox 2.0 the code is reduced to

-
PDFStreamParser parser = new PDFStreamParser(page);
+
PDFStreamParser parser = new PDFStreamParser(page);
 parser.parse();
 List<Object> tokens = parser.getTokens();
-
+ +

In addition this also works if the page content is defined as an array of content streams.

Iterating Pages

With PDFBox 2.0.0 the preferred way to iterate through the pages of a document is

-
for(PDPage page : document.getPages())
+
for(PDPage page : document.getPages())
 {
     ... (do something)
 }
-
+ +

PDF Rendering

With PDFBox 2.0.0 PDPage.convertToImage and PDFImageWriter have been removed. Instead the new PDFRenderer class shall be used.

-
PDDocument document = PDDocument.load(new File(pdfFilename));
+
PDDocument document = PDDocument.load(new File(pdfFilename));
 PDFRenderer pdfRenderer = new PDFRenderer(document);
 int pageCounter = 0;
 for (PDPage page : document.getPages())
@@ -292,7 +296,8 @@ and so on. The add method now supports al
     ImageIOUtil.writeImage(bim, pdfFilename + "-" + (pageCounter++) + ".png", 300);
 }
 document.close();
-
+ +

ImageIOUtil has been moved into the org.apache.pdfbox.tools.imageio package. This is in the pdfbox-tools download. If you are using Maven, the artifactId has the same name.

@@ -323,19 +328,21 @@ https://bugs.openjdk.java.net/browse/JDK-8041125

Users of PDFPrinter.silentPrint() should now use this code:

-
PrinterJob job = PrinterJob.getPrinterJob();
+
PrinterJob job = PrinterJob.getPrinterJob();
 job.setPageable(new PDFPageable(document));
 job.print();
-
+ +

While users of PDFPrinter.print() should now use this code:

-
PrinterJob job = PrinterJob.getPrinterJob();
+
PrinterJob job = PrinterJob.getPrinterJob();
 job.setPageable(new PDFPageable(document));
 if (job.printDialog()) {
     job.print();
 }
-
+ +

Advanced use case examples can be found in the examples package under org/apache/pdfbox/examples/printing/Printing.java

@@ -343,7 +350,7 @@ https://bugs.openjdk.java.net/browse/JDK-8041125

In 1.8, to get the text colors, one method was to pass an expanded .properties file to the PDFStripper constructor. To achieve the same in PDFBox 2.0 you can extend PDFTextStripper and add the following operators to the constructor:

-
addOperator(new SetStrokingColorSpace());
+
addOperator(new SetStrokingColorSpace());
 addOperator(new SetNonStrokingColorSpace());
 addOperator(new SetStrokingDeviceCMYKColor());
 addOperator(new SetNonStrokingDeviceCMYKColor());
@@ -355,7 +362,8 @@ in PDFBox 2.0 you can extend PDFTextStripper
 addOperator(new SetStrokingColorN());
 addOperator(new SetNonStrokingColor());
 addOperator(new SetNonStrokingColorN());
-
+ +

Interactive Forms

Large parts of the support for interactive forms (AcroForms) have been rewritten. The most notable change from 1.8.x is that
@@ -364,13 +372,14 @@ tree are now represented by the PDNonTerminalFie

With PDFBox 2.0.0 the preferred way to iterate through the fields is now

-
PDAcroForm form;
+
PDAcroForm form;
 ...
 for (PDField field : form.getFieldTree())
 {
     ... (do something)
 }
-
+ +

Most PDField subclasses now accept Java generic types such as String as parameters instead of the former COSBase subclasses.

@@ -388,8 +397,9 @@ annotations associated with a field.

The ReplaceText example has been removed because it gave the false impression that text can be replaced easily. Words are often split, as seen in this excerpt of a content stream:

-
[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ
-
+
[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ
+
+
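
To make the splitting problem concrete, here is a minimal sketch in plain Java (no PDFBox classes) that treats the excerpt above as a string. The class name TjArrayDemo and the method joinFragments are hypothetical, and the regular expression only handles simple literal strings; escaped parentheses and hex strings are deliberately ignored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TjArrayDemo
{
    // Matches simple literal strings such as (Do).
    // Assumption: no escaped parentheses and no hex strings in the input.
    private static final Pattern LITERAL = Pattern.compile("\\(([^)]*)\\)");

    // Concatenates the string fragments of a TJ array, dropping the kerning numbers.
    static String joinFragments(String tjArray)
    {
        StringBuilder sb = new StringBuilder();
        Matcher m = LITERAL.matcher(tjArray);
        while (m.find())
        {
            sb.append(m.group(1));
        }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        String raw = "[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ";
        // The word never appears contiguously in the raw content stream.
        System.out.println(raw.contains("Documentation"));
        // Only after joining the fragments does the word become visible.
        System.out.println(joinFragments(raw));
    }
}
```

A naive search-and-replace on the raw stream therefore misses the word, and rewriting the fragments would also require recomputing the kerning adjustments between them.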

Other problems will appear with font subsets: for example, if only the glyphs for a, b and c are used, these would be encoded as hex 0, 1 and 2, so you won’t find “abc”. Additionally, you can’t replace “c” with “d” because it isn’t part of the subset.
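
The subset issue can be illustrated with a simplified model in plain Java. This is only a sketch: the class SubsetEncodingDemo and its methods are hypothetical, and it assumes codes are assigned in order of first use, which real font subsets are not required to do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SubsetEncodingDemo
{
    // Builds a toy subset: each distinct glyph gets the next free code (0, 1, 2, ...).
    static Map<Character, Integer> buildSubset(String usedGlyphs)
    {
        Map<Character, Integer> codes = new LinkedHashMap<>();
        for (char c : usedGlyphs.toCharArray())
        {
            codes.putIfAbsent(c, codes.size());
        }
        return codes;
    }

    // Returns true only if every character of the text has a code in the subset.
    static boolean canEncode(String text, Map<Character, Integer> subset)
    {
        for (char c : text.toCharArray())
        {
            if (!subset.containsKey(c))
            {
                return false;
            }
        }
        return true;
    }

    // Encodes the text with the subset codes; fails for glyphs outside the subset.
    static int[] encode(String text, Map<Character, Integer> subset)
    {
        int[] result = new int[text.length()];
        for (int i = 0; i < text.length(); i++)
        {
            Integer code = subset.get(text.charAt(i));
            if (code == null)
            {
                throw new IllegalArgumentException("Glyph not in subset: " + text.charAt(i));
            }
            result[i] = code;
        }
        return result;
    }

    public static void main(String[] args)
    {
        Map<Character, Integer> subset = buildSubset("abc");
        // The stream contains the codes 0, 1, 2 rather than the characters a, b, c.
        System.out.println(java.util.Arrays.toString(encode("abc", subset)));
        // "d" has no glyph in the subset, so it cannot be written with this font.
        System.out.println(canEncode("abd", subset));
    }
}
```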

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/building.html ---------------------------------------------------------------------- diff --git a/content/building.html b/content/building.html index 1b9b004..8c16a42 100644 --- a/content/building.html +++ b/content/building.html @@ -164,9 +164,10 @@

You can obtain the latest source of PDFBox from our SVN repo. The current trunk is v3.0.0-SNAPSHOT. There is a separate branch for the 1.8.x series. You can fetch the latest 2.0 trunk using Subversion:

-
svn checkout http://svn.apache.org/repos/asf/pdfbox/trunk/
+
svn checkout http://svn.apache.org/repos/asf/pdfbox/trunk/
 cd trunk
-
+ +

Build dependencies

@@ -189,15 +190,17 @@ cd trunk

Building PDFBox 2.0 requires a JDK with “unlimited strength” cryptography, which requires extra files to be installed. For JDK 7, see Java Cryptography Extension (JCE). If these files are not installed, building PDFBox will fail the following test:

-
TestPublicKeyEncryption.setUp:70 JCE unlimited strength jurisdiction policy files are not installed
-
+
TestPublicKeyEncryption.setUp:70 JCE unlimited strength jurisdiction policy files are not installed
+
+
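
Before starting a build, it can help to check whether the JDK in use allows unlimited key lengths. The following is a minimal sketch using only the standard javax.crypto API; the class name JceCheck is hypothetical, and this is not necessarily the check that the PDFBox test itself performs.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

public class JceCheck
{
    // Returns the maximum AES key length (in bits) permitted by the installed policy files.
    static int maxAesKeyLength()
    {
        try
        {
            return Cipher.getMaxAllowedKeyLength("AES");
        }
        catch (NoSuchAlgorithmException e)
        {
            // AES is a required algorithm on every conforming JDK, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // With the restricted default policy of older JDKs, AES is capped at 128 bits.
    static boolean hasUnlimitedStrength()
    {
        return maxAesKeyLength() > 128;
    }

    public static void main(String[] args)
    {
        System.out.println("Max AES key length: " + maxAesKeyLength());
        System.out.println("Unlimited strength available: " + hasUnlimitedStrength());
    }
}
```

If the second line prints false, the policy files described above are most likely missing from the JDK that runs the build.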

Building with Maven

In the root directory of PDFBox:

-
mvn clean install
-
+
mvn clean install
+
+

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/codingconventions.html ---------------------------------------------------------------------- diff --git a/content/codingconventions.html b/content/codingconventions.html index afec7a7..0ec0c52 100644 --- a/content/codingconventions.html +++ b/content/codingconventions.html @@ -311,7 +311,7 @@

Here’s an example of PDFBox’s formatting style:

-
public class Foo extends Bar
+
public class Foo extends Bar
 {
     public static void main(String args[])
     {
@@ -328,7 +328,8 @@
         }
     }
 }
-
+ +

Eclipse Formatter

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/doap_PDFBox.rdf ---------------------------------------------------------------------- diff --git a/content/doap_PDFBox.rdf b/content/doap_PDFBox.rdf index 1565a48..82789c7 100644 --- a/content/doap_PDFBox.rdf +++ b/content/doap_PDFBox.rdf @@ -37,6 +37,20 @@ Apache PDFBox + 2017-11-02 + 2.0.8 + + + + + Apache PDFBox + 2017-07-20 + 2.0.7 + + + + + Apache PDFBox 2017-05-15 2.0.6 http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/63cbc6cc/content/siteupdate.html ---------------------------------------------------------------------- diff --git a/content/siteupdate.html b/content/siteupdate.html index dca9bdb..095dfcc 100644 --- a/content/siteupdate.html +++ b/content/siteupdate.html @@ -174,31 +174,35 @@

Checkout from the Git Repository

Before you can edit the site, you need to check it out from the Git repository:

-
git clone https://git-wip-us.apache.org/repos/asf/pdfbox-docs
-
+
git clone https://git-wip-us.apache.org/repos/asf/pdfbox-docs
+
+

Local Changes

You can now make changes and additions to the sources of the PDFBox website. To test them locally, use

-
jekyll serve
-
+
jekyll serve
+
+

which will compile the changes and run a local webserver at

-
http://localhost:4000
-
+
http://localhost:4000
+
+

Publish the Website (For Committers Only)

After you have made your local changes, follow these steps to publish the content:

Add the following server configuration in your ~/.m2/settings.xml file

-
<server>
+
<server>
   <id>pdfbox-site</id>
   <username>** USERNAME **</username>
   <password>** PASSWORD **</password>
 </server>
-
+ +

pdfbox-site is referenced from the PDFBox pom.xml file.

@@ -206,21 +210,24 @@

Ensure that the new website content can build locally

-
jekyll build
-
+
jekyll build
+
+

This will read the sources and generate the new content in the ./staging directory.

When you are happy with the new content, update the source repository

-
git commit -m "..."
+
git commit -m "..."
 git push origin master
-
+ +

Upload the new content to the production site

-
mvn scm-publish:publish-scm
-
+
mvn scm-publish:publish-scm
+
+

This will check out the current content into the ./target directory, apply the changes from ./staging and publish the changes to the PDFBox production website.

@@ -230,8 +237,9 @@ the changes to the PDFBox production website.

Run

-
$ mvn clean javadoc:aggregate scm-publish:publish-scm
-
+
$ mvn clean javadoc:aggregate scm-publish:publish-scm
+
+

from the <SVN_ROOT>/../pdfbox directory.