lucene-solr-commits mailing list archives

From Apache Wiki <>
Subject [Solr Wiki] Update of "DataImportHandler" by NoblePaul
Date Wed, 19 Mar 2008 05:11:07 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by NoblePaul:

  As the name suggests, this is implemented as a SolrRequestHandler. The configuration is
provided in two places:
   * solrconfig.xml (data source information is read from here e.g. JDBC Driver, JDBC URL,
Username, Password etc.)
   * data-config.xml (DB table/column to SOLR document mapping goes here)
- inline:DataImportHandlerOverview.png
  = Usage =
  In order to use this handler, the following steps are required.
@@ -362, +362 @@

+ = Architecture =
+ The following diagram describes the logical flow for a sample configuration.
+ inline:DataImportHandlerOverview.png
+ The use case is as follows:
+ There are 3 datasources: two RDBMS (jdbc1, jdbc2) and one xml/http (B).
+  * The root entity starts with a table called 'A' in the 'jdbc1' datasource. The entity is
conveniently named after the table itself
+  * Each row emitted by the 'query' in entity 'A' is fed into its sub-entities B and C
+  * B and C use a column of 'A' to construct their own queries through placeholders
like `${A.a}`
+    * B has a url  (B is an xml/http datasource) 
+    * C has a query
+  * C has two transformers ('f' and 'g' )   
+  * Each row that comes out of C is fed into 'f' and 'g' sequentially (transformers are chained).
Each transformer modifies its input. Note that the transformer 'g' produces 2 output
rows for an input row `f(C.1)`
+  * The end output of each entity is combined together to construct a document
+    * Note that the intermediate rows from C, i.e. `C.1, C.2, f(C.1), f(C1)`, are ignored
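
The logical flow above can be simulated with a small Python sketch. This is only an illustration of the row flow, not DataImportHandler code: the in-memory tables, the `example.invalid` URL, the query strings and every function name (`resolve`, `fetch_b`, `fetch_c`, `apply_chain`, `build_documents`) are assumptions made up for the example. It shows placeholder substitution from the parent row, a transformer chain in which 'g' emits two rows per input row, and a document built only from each entity's final output.

```python
import re

# Hypothetical in-memory stand-ins for the datasources (not real DIH objects).
TABLE_A = [{"a": 1, "name": "first"}, {"a": 2, "name": "second"}]
TABLE_C = {1: [{"c": "x"}, {"c": "y"}], 2: [{"c": "z"}]}


def resolve(template, scope):
    """Replace placeholders such as ${A.a} with values from the parent row."""
    def lookup(match):
        entity, column = match.group(1).split(".")
        return str(scope[entity][column])
    return re.sub(r"\$\{([^}]+)\}", lookup, template)


def fetch_b(url):
    # Stand-in for the xml/http datasource: one row per resolved url.
    return [{"b_source": url}]


def fetch_c(query):
    # Stand-in for the second RDBMS: reads the key back out of the resolved query.
    key = int(query.rsplit("=", 1)[1])
    return TABLE_C.get(key, [])


def f(row):
    # Transformer 'f': one row in, one modified row out.
    return [dict(row, c=row["c"].upper())]


def g(row):
    # Transformer 'g': one row in, two rows out.
    return [dict(row, part=1), dict(row, part=2)]


def apply_chain(rows, transformers):
    # Each transformer consumes the full output of the previous one.
    for transform in transformers:
        rows = [out for row in rows for out in transform(row)]
    return rows


def build_documents():
    docs = []
    for a_row in TABLE_A:
        scope = {"A": a_row}
        b_rows = fetch_b(resolve("http://example.invalid/b?a=${A.a}", scope))
        c_raw = fetch_c(resolve("select * from C where a=${A.a}", scope))
        c_rows = apply_chain(c_raw, [f, g])
        # Only the final output of each entity reaches the document;
        # the raw C rows and the output of 'f' alone are discarded.
        docs.append({"A": a_row, "B": b_rows, "C": c_rows})
    return docs


if __name__ == "__main__":
    for doc in build_documents():
        print(doc)
```

In this sketch the first row of 'A' matches two rows in C, so after 'f' and 'g' the resulting document carries four C rows.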
  = Where to find it? =
  DataImportHandler is not in SOLR right now. You can either:
   * Download the patch in [ SOLR-469] in the
