pdfbox-commits mailing list archives

From msahy...@apache.org
Subject pdfbox-docs git commit: Site checkin for project Apache PDFBox Website
Date Sun, 29 Nov 2015 13:40:35 GMT
Repository: pdfbox-docs
Updated Branches:
  refs/heads/asf-site 1203f0b90 -> 9e6dbf4ec


Site checkin for project Apache PDFBox Website


Project: http://git-wip-us.apache.org/repos/asf/pdfbox-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/pdfbox-docs/commit/9e6dbf4e
Tree: http://git-wip-us.apache.org/repos/asf/pdfbox-docs/tree/9e6dbf4e
Diff: http://git-wip-us.apache.org/repos/asf/pdfbox-docs/diff/9e6dbf4e

Branch: refs/heads/asf-site
Commit: 9e6dbf4ecdfb25b25eb023454402e5daab3560be
Parents: 1203f0b
Author: Maruan Sahyoun <sahyoun@fileaffairs.de>
Authored: Sun Nov 29 14:40:32 2015 +0100
Committer: Maruan Sahyoun <sahyoun@fileaffairs.de>
Committed: Sun Nov 29 14:40:32 2015 +0100

----------------------------------------------------------------------
 content/2.0/migration.html | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/9e6dbf4e/content/2.0/migration.html
----------------------------------------------------------------------
diff --git a/content/2.0/migration.html b/content/2.0/migration.html
index 8e5a665..184a617 100644
--- a/content/2.0/migration.html
+++ b/content/2.0/migration.html
@@ -216,6 +216,23 @@ and so on. The <code>add</code> method now supports all the different type of re
 <li><code>LosslessFactory.createFromImage</code> (this is best if you start with a BufferedImage).</li>
 </ul>
 
+<h3 id="parsing-the-page-content">Parsing the Page Content</h3>
+
+<p>Getting the content for a page has been simplified.</p>
+
+<p>Prior to PDFBox 2.0 parsing the page content was done using</p>
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">PDStream</span> <span class="n">contents</span> <span class="o">=</span> <span class="n">page</span><span class="o">.</span><span class="na">getContents</span><span class="o">();</span>
+<span class="n">PDFStreamParser</span> <span class="n">parser</span> <span class="o">=</span> <span class="k">new</span> <span class="n">PDFStreamParser</span><span class="o">(</span><span class="n">contents</span><span class="o">.</span><span class="na">getStream</span><span class="o">());</span>
+<span class="n">parser</span><span class="o">.</span><span class="na">parse</span><span class="o">();</span>
+<span class="n">List</span><span class="o">&lt;</span><span class="n">Object</span><span class="o">&gt;</span> <span class="n">tokens</span> <span class="o">=</span> <span class="n">parser</span><span class="o">.</span><span class="na">getTokens</span><span class="o">();</span>
+</code></pre></div>
+<p>With PDFBox 2.0 the code is reduced to </p>
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">PDFStreamParser</span> <span class="n">parser</span> <span class="o">=</span> <span class="k">new</span> <span class="n">PDFStreamParser</span><span class="o">(</span><span class="n">page</span><span class="o">);</span>
+<span class="n">parser</span><span class="o">.</span><span class="na">parse</span><span class="o">();</span>
+<span class="n">List</span><span class="o">&lt;</span><span class="n">Object</span><span class="o">&gt;</span> <span class="n">tokens</span> <span class="o">=</span> <span class="n">parser</span><span class="o">.</span><span class="na">getTokens</span><span class="o">();</span>
+</code></pre></div>
+<p>In addition this also works if the page content is defined as an <strong>array of content streams</strong>.</p>
+
 <h3 id="iterating-pages">Iterating Pages</h3>
 
 <p>With PDFBox 2.0.0 the prefered way to iterate through the pages of a document is</p>

