lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "ExtractingRequestHandler" by GrantIngersoll
Date Thu, 10 Sep 2009 13:11:04 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by GrantIngersoll:
http://wiki.apache.org/solr/ExtractingRequestHandler

------------------------------------------------------------------------------
  
  = Sending documents to Solr =
  
- // TODO: discribe the different ways to send the documents to solr (POST body, form encoded, remoteStreaming)
+ // TODO: describe the different ways to send the documents to solr (POST body, form encoded, remoteStreaming)
   * curl http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text --data-binary @tutorial.html -H 'Content-type:text/html'
         <!> NOTE: this streams the raw contents of the file, so Solr receives no information about the file's name.
- 
+  * SolrJ:  Use the ContentStreamUpdateRequest (see SolrExampleTests.java for full example):{{{
+     ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
+     up.addFile(new File("mailing_lists.pdf"));
+     up.setParam("literal.id", "mailing_lists.pdf");
+     up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
+     result = server.request(up);
+     assertNotNull("Couldn't upload mailing_lists.pdf", result);
+     rsp = server.query( new SolrQuery( "*:*") );
+     Assert.assertEquals( 1, rsp.getResults().getNumFound() );
+ }}}
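For readers without SolrJ on the classpath, the curl call above can be approximated with the JDK's built-in HTTP client (Java 11+). This is only a sketch: the class name `ExtractRequestSketch` and the helper `buildExtractRequest` are hypothetical, and the URL and `ext.*` parameters are copied from the curl example rather than checked against a running Solr. It builds the POST request without sending it.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ExtractRequestSketch {

    // Hypothetical helper: builds (but does not send) a POST whose body is the
    // raw document bytes, mirroring `curl --data-binary @file -H 'Content-type:...'`.
    static HttpRequest buildExtractRequest(String baseUrl,
                                           Map<String, String> params,
                                           byte[] body,
                                           String contentType) {
        // URL-encode each parameter and join them into a query string.
        String query = params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return HttpRequest.newBuilder(URI.create(baseUrl + "?" + query))
                .header("Content-Type", contentType)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    public static void main(String[] args) {
        // Parameters taken from the curl example; assumed, not verified here.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ext.idx.attr", "true");
        params.put("ext.def.fl", "text");

        // Stand-in for the contents of tutorial.html.
        byte[] body = "<html><body>tutorial</body></html>".getBytes(StandardCharsets.UTF_8);

        HttpRequest request = buildExtractRequest(
                "http://localhost:8983/solr/update/extract", params, body, "text/html");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send the request, pass it to `java.net.http.HttpClient#send` with a body handler such as `HttpResponse.BodyHandlers.ofString()`. As with curl, the file name is not transmitted to Solr.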
  
  == Additional Resources ==
  * [http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#example.source Lucid Imagination article]
