lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "ExtractingRequestHandler" by YonikSeeley
Date Mon, 10 Aug 2009 13:54:50 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by YonikSeeley:
http://wiki.apache.org/solr/ExtractingRequestHandler

The comment on the change is:
move the TODO out of the finished top part

------------------------------------------------------------------------------
  
   
  And then query via http://localhost:8983/solr/select?q=attr_content:tutorial
- 
- // TODO: move this somewhere else to a more in-depth discussion of different ways to send the data to Solr (prob with remoteStreaming discussion)
-  * curl http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text --data-binary @tutorial.html  -H 'Content-type:text/html'
-        <!> NOTE, this literally streams the file, which does not, then, provide info to Solr about the name of the file.
- 
  
  = Input Parameters =
   * map.<source_field>=<target_field> - Maps (moves) one field name to another. Example: {{{map.content=text}}} will cause the content field normally generated by Tika to be moved to the "text" field.
@@ -186, +181 @@

  
  See TikaExtractOnlyExampleOutput.
  
+ = Sending documents to Solr =
+ 
+ // TODO: describe the different ways to send the documents to Solr (POST body, form encoded, remoteStreaming)
+  * curl http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text --data-binary @tutorial.html  -H 'Content-type:text/html'
+        <!> NOTE, this literally streams the file, which does not, then, provide info to Solr about the name of the file.
+ 
  
  == Additional Resources ==
  * [http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#example.source Lucid Imagination article]
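
The raw POST from the "Sending documents to Solr" section in the diff above could be wrapped roughly as follows. This is only a sketch: the helper names (extract_url, post_raw, post_multipart) and the form field name "myfile" are made up for illustration, and the ext.idx.attr / ext.def.fl parameters are copied from the wiki example as-is. Quoting the URL avoids having to escape the & by hand. The multipart variant relies on curl's -F option, which includes the file name in the request. Whether the handler makes use of that name is not stated in the page and is assumed here.

```shell
#!/bin/sh
# Base URL from the wiki example; override via the environment if needed.
SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr}"

# Hypothetical helper: build the extract URL.
# $1 = default field (the ext.def.fl value from the wiki example)
extract_url() {
  printf '%s/update/extract?ext.idx.attr=true&ext.def.fl=%s' "$SOLR_BASE" "$1"
}

# Raw POST body: streams the file bytes only, so no file name is sent.
# $1 = file, $2 = content type, $3 = default field
post_raw() {
  curl "$(extract_url "$3")" --data-binary "@$1" -H "Content-type:$2"
}

# Multipart form upload: curl -F sends the file name along with the data.
# "myfile" is an arbitrary, hypothetical form field name.
# $1 = file, $2 = default field
post_multipart() {
  curl "$(extract_url "$2")" -F "myfile=@$1"
}
```

For example, post_raw tutorial.html text/html text would send the same request as the curl line in the diff.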
