lucene-java-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-java Wiki] Update of "LuceneFAQ" by NickBurch
Date Thu, 28 Jun 2007 12:00:40 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The following page has been changed by NickBurch:
http://wiki.apache.org/lucene-java/LuceneFAQ

The comment on the change is:
Jakarta POI -> Apache POI, and note on using POI for visio files

------------------------------------------------------------------------------
  
  In order to index Word documents you need to first parse them to extract text that you want
to index from them.  Here are some Word parsers that can help you with that:
  
- [http://jakarta.apache.org/poi/ Jakarta Apache POI] has an early development level Microsoft
Word parser for versions of Word from Office 97, 2000, and XP.
+ [http://poi.apache.org/hwpf/ Apache POI] has an early development level Microsoft Word parser
for versions of Word from Office 97, 2000, and XP.
- 
  
  ==== How can I index MS-Excel documents? ====
  
  In order to index Excel documents you need to first parse them to extract text that you
want to index from them.  Here are some Excel parsers that can help you with that:
  
- [http://jakarta.apache.org/poi/ Jakarta Apache POI] has an excellent Microsoft Excel parser
for versions of Excel from Office 97, 2000, and XP.  You can also modify Excel files with
this tool.
+ [http://poi.apache.org/hssf/ Apache POI] has an excellent Microsoft Excel parser for versions
of Excel from Office 97, 2000, and XP.  You can also modify Excel files with this tool.
- 
  
  ==== How can I index MS-Powerpoint documents? ====
  
- In order to index Powerpoint documents you need to first parse them to extract text that
you want to index from them.  You can use the [http://jakarta.apache.org/poi/ Jakarta Apache
POI], as it contains a parser for Powerpoint documents.
+ In order to index Powerpoint documents you need to first parse them to extract text that
you want to index from them.  You can use the [http://poi.apache.org/hslf/  Apache POI], as
it contains a parser for Powerpoint documents.
+ 
+ ==== How can I index MS-Visio documents? ====
+ 
+ In order to index Visio documents you need to first parse them to extract text that you
want to index from them.  You can use the [http://poi.apache.org/hdgf/ Apache POI], as it
contains a parser for Visio documents.
  
  
  ==== How can I index Email (from MS-Exchange or another IMAP server) ? ====
