lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "ExtractingRequestHandler" by GrantIngersoll
Date Mon, 07 Dec 2009 15:47:55 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "ExtractingRequestHandler" page has been changed by GrantIngersoll.
http://wiki.apache.org/solr/ExtractingRequestHandler?action=diff&rev1=50&rev2=51

--------------------------------------------------

  
  The tika.config entry points to a file containing a Tika configuration.  You would only
need this if you have customized your own Tika configuration.  The Tika config contains info
about parsers, mime types, etc.
  
- You may also need to adjust the {{{multipartUploadLimitInKB}}} attribute as follows if you
are submitting very large documents. The {{{enableRemoteStreaming}}} is not used by the !ExtractingRequestHandler.
+ You may also need to adjust the {{{multipartUploadLimitInKB}}} attribute as follows if you
are submitting very large documents. The {{{enableRemoteStreaming}}} can be used by the !ExtractingRequestHandler.
+ In your solrconfig.xml, you must turn it on:
  {{{
    <requestDispatcher handleSelect="true" >
-     <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="20480" />
+     <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="20480" />
      ....
  }}}
+ 
+ See ContentStreams for more info.  As an example of using remote streaming, you can do:
+ {{{
+  curl "http://localhost:8983/solr/update/extract?stream.file=/path/to/file/StatesLeftToVisit.doc&stream.contentType=application/msword&literal.id=states.doc"
+ }}}
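
For illustration only (not part of the wiki change above), the same request can be assembled from variables so the individual parameters are easier to adjust. This is a minimal sketch: the variable names are hypothetical, the values are copied from the example above, and it assumes remote streaming is enabled and that the file path is readable by the Solr server process, since stream.file is resolved on the server side.

```shell
#!/bin/sh
# Hypothetical variable names; values are taken from the wiki example above.
SOLR_URL="http://localhost:8983/solr/update/extract"
FILE_PATH="/path/to/file/StatesLeftToVisit.doc"   # read by the Solr server, not by curl
CONTENT_TYPE="application/msword"
DOC_ID="states.doc"

# Assemble the request URL from the same three parameters used above.
REQUEST_URL="${SOLR_URL}?stream.file=${FILE_PATH}&stream.contentType=${CONTENT_TYPE}&literal.id=${DOC_ID}"
echo "$REQUEST_URL"

# Uncomment to send the request to a running Solr instance:
# curl "$REQUEST_URL"
```

Values containing spaces or other reserved characters would need to be URL-encoded before being placed in the query string; the sketch above does not do that.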
+ 
+ 
  
  Lastly, date.formats lets you specify the java.text.SimpleDateFormat patterns used to
convert extracted input into a Date.  Solr comes configured with the following
date formats (see the DateUtil class in Solr):
  {{{
