lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "ExtractingRequestHandler" by SimonRosenthal
Date Sun, 16 Nov 2008 18:53:36 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by SimonRosenthal:
http://wiki.apache.org/solr/ExtractingRequestHandler

The comment on the change is:
Corrected Tika URL

------------------------------------------------------------------------------
  
  = Introduction =
  
- A common need of users is the ability to ingest binary and/or structured documents such
as Office, PDF and other proprietary formats.  The [http://www.lucene.apache.org/tika Apache
Tika] project provides a framework for wrapping many different file format parsers, such as
PDFBox, POI and others.
+ A common need of users is the ability to ingest binary and/or structured documents such
as Office, PDF and other proprietary formats.  The [http://incubator.apache.org/tika/ Apache
Tika] project provides a framework for wrapping many different file format parsers, such as
PDFBox, POI and others.
  
  Solr's !ExtractingRequestHandler provides a wrapper around Tika to allow users to upload
binary files to Solr and have Solr extract text from them and then index it.
  
