lucene-solr-commits mailing list archives

From Apache Wiki <>
Subject [Solr Wiki] Update of "DataImportHandler" by ShalinMangar
Date Wed, 26 Mar 2008 09:11:06 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by ShalinMangar:

The comment on the change is:
Added details on using HttpDataSource

   * data-config.xml (DB Table/column to SOLR document mapping comes here)
- = Usage =
+ = Usage with databases =
  In order to use this handler, the following steps are required.
   * Define a data-config.xml and specify the location of this file in solrconfig.xml under the DataImportHandler section
   * Give connection information such as the JDBC driver, JDBC URL, DB username and password in solrconfig.xml under the DataImportHandler section
@@ -256, +256 @@

   * For each row given by ''deltaQuery'', the parentDeltaQuery is executed.
   * If any row in the root/child entity changes, we regenerate the complete SOLR document which contained that row.
+ = Usage with RSS/ATOM =
+ You can use DataImportHandler to index data from HTTP-based data sources. This includes indexing from REST/XML APIs as well as from RSS/ATOM feeds.
+ == Configuration in solrconfig.xml ==
+ A sample DataImportHandler configuration in solrconfig.xml looks like this:
+ {{{
+ <requestHandler name="/dataimport" class="org.apache.solr.handler.DataImportHandler">
+   <lst name="defaults">
+     <str name="config">/home/username/data-config.xml</str>
+     <lst name="datasource">
+       <str name="type">HttpDataSource</str>
+       <str name="baseUrl">http://host:port/</str>
+       <str name="encoding">UTF-8</str>
+       <str name="connectionTimeout">5000</str>
+       <str name="readTimeout">10000</str>
+     </lst>
+   </lst>
+ </requestHandler>
+ }}}
+ ''Note''
+  * baseUrl is optional. Use it when the host/port changes between Dev/QA/Prod environments; the change is then confined to solrconfig.xml.
+  * encoding is optional. By default, the response from the URL is read as UTF-8.
+  * connectionTimeout is optional. The default value is 5000 ms.
+  * readTimeout is optional. The default value is 10000 ms.
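Since the notes describe baseUrl, encoding, connectionTimeout and readTimeout as optional, a minimal datasource declaration could presumably leave them all out and rely on the stated defaults (UTF-8, 5000 ms, 10000 ms). The fragment below is a sketch derived from the sample configuration, not a tested setup; the config path is the same placeholder used there.

```xml
<!-- Sketch only: assumes the handler accepts a datasource that declares just its type. -->
<requestHandler name="/dataimport" class="org.apache.solr.handler.DataImportHandler">
  <lst name="defaults">
    <!-- Placeholder path, copied from the sample configuration -->
    <str name="config">/home/username/data-config.xml</str>
    <lst name="datasource">
      <!-- baseUrl, encoding, connectionTimeout and readTimeout are omitted,
           so the documented defaults would apply -->
      <str name="type">HttpDataSource</str>
    </lst>
  </lst>
</requestHandler>
```

Adding baseUrl back is only worthwhile when the host/port differs between environments, as the first note explains.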
+ == Configuration in data-config.xml ==
+  * TODO
  = Extending the tool with APIs =
  The examples we explored are, admittedly, trivial. It is not possible to meet all user needs with an XML configuration alone, so we expose a few interfaces that users can implement to extend the functionality.
