poi-commits mailing list archives

From n...@apache.org
Subject svn commit: r1486432 - in /poi/site/src/documentation/content/xdocs: subversion.xml text-extraction.xml
Date Sun, 26 May 2013 16:57:55 GMT
Author: nick
Date: Sun May 26 16:57:55 2013
New Revision: 1486432

URL: http://svn.apache.org/r1486432
Log:
Expand/update the Source Code Repo and Text Extractor docs

Modified:
    poi/site/src/documentation/content/xdocs/subversion.xml
    poi/site/src/documentation/content/xdocs/text-extraction.xml

Modified: poi/site/src/documentation/content/xdocs/subversion.xml
URL: http://svn.apache.org/viewvc/poi/site/src/documentation/content/xdocs/subversion.xml?rev=1486432&r1=1486431&r2=1486432&view=diff
==============================================================================
--- poi/site/src/documentation/content/xdocs/subversion.xml (original)
+++ poi/site/src/documentation/content/xdocs/subversion.xml Sun May 26 16:57:55 2013
@@ -31,22 +31,31 @@
     <section><title>Download the Source</title>
       <p>
          Most users of the source code probably don't need to have day to 
-         day access to the source code as it changes. For these users we 
-         provide easy to unpack source code from releases via our
+         day access to the source code as it changes. Most users will want
+         to make use of our <link href="download.html">source release</link>
+         packages, which contain the complete source tree for each binary
+         release, suitable for browsing or debugging. These source releases
+         are available from our
          <link href="download.html">download page.</link>
       </p>
+      <p>
+         The Apache POI sourcecode is also available as source artifacts
+         in the Maven Central repository, which may be helpful for those
+         users who make use of POI and wish to inspect the source (eg when
+         debugging in an IDE).
+      </p>
     </section>
     <section><title>Access the Version Controlled Source Code</title>
       <p>
-         For information on connecting to the ASF Subversion repositories, 
-         see the 
+         For general information on connecting to the ASF Subversion, 
+         repositories, see the 
         <link href="http://www.apache.org/dev/version-control.html">version control page.</link>
       </p>
       <p>Subversion is an open-source version control system. It has been contributed to the Apache Software Foundation and is
-	now available <link href="http://incubator.apache.org/projects/subversion.html">here</link>.
+	now available <link href="http://subversion.apache.org/">here</link>.
       </p>
       <p>
-	The root url of the ASF Subversion repository is 
+       The root url of the ASF Subversion repository is 
        <link href="http://svn.apache.org/repos/asf/">http://svn.apache.org/repos/asf/</link>
        for non-committers and 
        <link href="https://svn.apache.org/repos/asf/">https://svn.apache.org/repos/asf/</link>

@@ -75,11 +84,30 @@
     </section>
     <section><title>Git access to POI sources </title>
       <p>
-        Git read-only access to POI sources is now available. 
-        Please see the <link href="http://git.apache.org/">Git at Apache</link> page for details. 
-        Git Clone URL: <link href="git://git.apache.org/poi.git">git://git.apache.org/poi.git</link>
-        and Http Clone URL:  <link href="http://git.apache.org/poi.git">http://git.apache.org/poi.git</link>.
+        The master source repository for Apache POI is the Subversion
+        one listed above. To support those users and developers who prefer
+        to use the Git tooling, read-only access to the POI source tree is
+        also available via Git. The Git mirrors normally track SVN to 
+        within a few minutes.
       </p>
+      <p>
+        The official read-only Git repository for Apache POI is available
+        from <link href="http://git.apache.org/">git.apache.org/</link> .
+        The Git Clone URL is: <link href="git://git.apache.org/poi.git">git://git.apache.org/poi.git</link>
+        and Http Clone URL: <link href="http://git.apache.org/poi.git">http://git.apache.org/poi.git</link>.
+         Please see the <link href="http://git.apache.org/">Git at 
+         Apache</link> page for more details on the service.
+      </p>
+      <p>
+        In addition to the <link href="http://git.apache.org/">git.apache.org/</link>
+        repository, changes are also mirrored in near-realtime to GitHub.
+        The GitHub repository is available at
+        <link href="https://github.com/apache/poi">https://github.com/apache/poi</link>.
+        Please note that the GitHub repository is read-only, and all 
+        contributions should continue to be sent via Bugzilla for tracking.
+        (Git patches are fine though). Please see the
+        <link href="guidelines.html">contribution guidelines</link> for more
+        information on getting involved in the project.</p>
     </section>
   </body>
   <footer>

Modified: poi/site/src/documentation/content/xdocs/text-extraction.xml
URL: http://svn.apache.org/viewvc/poi/site/src/documentation/content/xdocs/text-extraction.xml?rev=1486432&r1=1486431&r2=1486432&view=diff
==============================================================================
--- poi/site/src/documentation/content/xdocs/text-extraction.xml (original)
+++ poi/site/src/documentation/content/xdocs/text-extraction.xml Sun May 26 16:57:55 2013
@@ -29,14 +29,23 @@
   
   <body>
     <section><title>Overview</title>
-      <p>Apache POI provides text extraction for all the supported file
-       formats. In addition, it provides access to the metadata
-       associated with a given file, such as title and author.</p>
-      <p>In addition to providing direct text extraction classes,
-       POI works closely with the 
-       <link href="http://incubator.apache.org/tika/">Apache Tika</link>
-       text extraction library. Users may wish to simply utilise 
-       the functionality provided by Tika.</p>
+      <p>For a number of years now, Apache POI has provided basic 
+       text extraction for all the project supported file formats. In 
+       addition, as well as the (plain) text, these provides access to 
+       the metadata associated with a given file, such as title and 
+       author.</p>
+      <p>For more advanced text extraction needs, including Rich Text
+       extraction (such as formatting and styling), along with XML and
+       HTML output, Apache POI works closely with 
+       <link href="http://tika.apache.org/">Apache Tika</link> to deliver 
+       POI-powered Tika Parsers for all the project supported file formats.</p>
+      <p>If you are after turn-key text extraction, including the latest
+       support, styles etc, you are strongly advised to make use of 
+       <link href="http://tika.apache.org/">Apache Tika</link>, which builds
+       on top of POI to provide Text and Metadata extraction. If you wish
+       to have something very simple and stand-alone, or you wish to make
+       heavy modificiations, then the POI provided text extractors documented
+       below might be a better fit for your needs.</p>
     </section>
 
     <section><title>Common functionality</title>
@@ -56,12 +65,16 @@
       provides common methods to get at the OOXML metadata.</p>
     </section>
 
-    <section><title>Text Extractor Factory - POI 3.5 or later</title>
-     <p>A new class in POI 3.5, 
-      <em>org.apache.poi.extractor.ExtractorFactory</em> provides a
+    <section><title>Text Extractor Factory</title>
+     <p>As part of the addition of OOXML support in Apache POI 3.5, there
+      is a common class to select the appropriate POI text extractor for 
+      you. <em>org.apache.poi.extractor.ExtractorFactory</em> provides a
       similar function to WorkbookFactory. You simply pass it an
-      InputStream, a file, a POIFSFileSystem or a OOXML Package. It
+      InputStream, a File, a POIFSFileSystem or a OOXML Package. It
       figures out the correct text extractor for you, and returns it.</p>
+     <p>For complete detection and text extractor auto-selection, users
+      are strongly encouraged to investigate
+      <link href="http://tika.apache.org/">Apache Tika</link>.</p>
     </section>
 
     <section><title>Excel</title>




