poi-dev mailing list archives

From n...@apache.org
Subject svn commit: r409632 - in /jakarta/poi/trunk/src/documentation/content/xdocs/hwpf: book.xml quick-guide.xml
Date Fri, 26 May 2006 10:43:42 GMT
Author: nick
Date: Fri May 26 03:43:42 2006
New Revision: 409632

URL: http://svn.apache.org/viewvc?rev=409632&view=rev
Add a quick guide to using the text extractor and friends, since that's a common use


Modified: jakarta/poi/trunk/src/documentation/content/xdocs/hwpf/book.xml
URL: http://svn.apache.org/viewvc/jakarta/poi/trunk/src/documentation/content/xdocs/hwpf/book.xml?rev=409632&r1=409631&r2=409632&view=diff
--- jakarta/poi/trunk/src/documentation/content/xdocs/hwpf/book.xml (original)
+++ jakarta/poi/trunk/src/documentation/content/xdocs/hwpf/book.xml Fri May 26 03:43:42 2006
@@ -7,6 +7,7 @@
 	<menu label="HWPF">
 		<menu-item label="Overview" href="index.html"/>
+		<menu-item label="Quick Guide" href="quick-guide.html"/>
 		<menu-item label="HWPF Format" href="docoverview.html"/>
 		<menu-item label="HWPF Project plan" href="projectplan.html"/>

Added: jakarta/poi/trunk/src/documentation/content/xdocs/hwpf/quick-guide.xml
URL: http://svn.apache.org/viewvc/jakarta/poi/trunk/src/documentation/content/xdocs/hwpf/quick-guide.xml?rev=409632&view=auto
--- jakarta/poi/trunk/src/documentation/content/xdocs/hwpf/quick-guide.xml (added)
+++ jakarta/poi/trunk/src/documentation/content/xdocs/hwpf/quick-guide.xml Fri May 26 03:43:42
@@ -0,0 +1,45 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Copyright (C) 2004 The Apache Software Foundation. All rights reserved. -->
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd">
+    <header>
+        <title>POI-HWPF - A Quick Guide</title>
+        <subtitle>Overview</subtitle>
+        <authors>
+            <person name="Nick Burch" email="nick at torchbox dot com"/>
+        </authors>
+    </header>
+    <body>
+        <section><title>Basic Text Extraction</title>
+        <p>For basic text extraction, make use of 
+<code>org.apache.poi.hwpf.extractor.WordExtractor</code>. It accepts an input
+stream or a <code>HWPFDocument</code>. The <code>getText()</code>
+method can be used to 
+get the text from all the paragraphs, or <code>getParagraphText()</code>
+can be used to fetch the text from each paragraph in turn. The other
+option is <code>getTextFromPieces()</code>, which is very fast, but
+tends to return things that aren't text from the page. YMMV.
+		</p>
+		</section>
+		<section><title>Specific Text Extraction</title>
+		<p>To get specific bits of text, first create a 
+<code>org.apache.poi.hwpf.HWPFDocument</code>. Fetch the range 
+with <code>getRange()</code>, then get paragraphs from that. You
+can then get text and other properties.
+		</p>
+		</section>
+		<section><title>Changing Text</title>
+		<p>It is possible to change the text via 
+		<code>insertBefore()</code> and <code>insertAfter()</code>
+		on a <code>Range</code> object (either a <code>Range</code>,
+		<code>Paragraph</code> or <code>CharacterRun</code>).
+		It is also possible to delete a <code>Range</code>, but this
+		code is known to have bugs in it.
+		</p>
+		</section>
+	</body>
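
For readers of this commit, the three sections of quick-guide.xml can be sketched in Java roughly as follows. This is only an illustration, not code from the commit: the class name QuickGuideSketch, the default file names in.doc and out.doc, and the inserted marker text are all hypothetical, and it assumes a POI build with the HWPF scratchpad classes on the classpath. Exact behaviour (for instance, how paragraph terminators are returned) may differ between POI releases.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;

// Hypothetical class name; not part of POI or of this commit.
public class QuickGuideSketch {
    public static void main(String[] args) throws IOException {
        // Assumed file names, overridable from the command line.
        String inPath = args.length > 0 ? args[0] : "in.doc";
        String outPath = args.length > 1 ? args[1] : "out.doc";

        // The constructor reads the whole stream, so it can be closed afterwards.
        HWPFDocument doc;
        try (InputStream in = new FileInputStream(inPath)) {
            doc = new HWPFDocument(in);
        }

        // Basic text extraction: all text at once, then paragraph by paragraph.
        WordExtractor extractor = new WordExtractor(doc);
        System.out.println(extractor.getText());
        for (String paragraphText : extractor.getParagraphText()) {
            System.out.println(paragraphText.trim());
        }

        // Specific text extraction: walk the Range, its paragraphs and their runs.
        Range range = doc.getRange();
        for (int i = 0; i < range.numParagraphs(); i++) {
            Paragraph para = range.getParagraph(i);
            System.out.println(i + ": " + para.text().trim());
            for (int j = 0; j < para.numCharacterRuns(); j++) {
                CharacterRun run = para.getCharacterRun(j);
                if (run.isBold()) {
                    System.out.println("  bold run: " + run.text().trim());
                }
            }
        }

        // Changing text: insert a marker before the first paragraph, then save a copy.
        if (range.numParagraphs() > 0) {
            range.getParagraph(0).insertBefore("[reviewed] ");
        }
        try (OutputStream out = new FileOutputStream(outPath)) {
            doc.write(out);
        }
    }
}
```

Writing to a separate output file keeps the original document untouched, which seems prudent given that the guide itself warns that the editing code has known bugs.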

To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/