lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "DataImportHandler" by NoblePaul
Date Wed, 26 Mar 2008 10:35:24 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by NoblePaul:
http://wiki.apache.org/solr/DataImportHandler

------------------------------------------------------------------------------
  In order to get data from the database, our design philosophy revolves around 'templatized
sql' entered by the user for each entity. This gives the user the entire power of SQL if he
needs it. The root entity is the central table whose columns can be used to join this table
with other child entities.
  
  === Schema for the xml config ===
-   The dataconfig does not have a rigid schema. The attributes in the entity/field are aribitrary
and depends on the `processor` and `transformer`. For !JdbcdataSource the entity attributes
are 
+   The dataconfig does not have a rigid schema. The attributes in the entity/field are arbitrary
and depend on the `processor` and `transformer`. For !JdbcdataSource the entity attributes
are 
- 
+ The default attributes for an entity
   * '''`name`''' (required) : A unique name used to identify an entity
-  * '''`query`''' (required) : The sql string using which to query the db
+  * '''`processor`''' (required) : The value must be ''!XPathEntityProcessor''
   * '''`transformer`'''  : Transformers to be applied on this entity. (See the transformer
section)
   * '''`dataSource`''' : The name of a datasource as put in the solrconfig.xml. (Used if
there are multiple datasources) 
-  * '''`pk`''' : The primary key for the entity
+  * '''`pk`''' : The primary key for the entity. Only needed for the root entity. This will
be the id for the document
+  * '''`rootEntity`''' : By default the entities falling under the document are root entities.
If it is set to false, the entity directly falling under that entity will be treated as the
root entity (so on and so forth). For every row returned by the root entity a document is
created in Solr
+ 
+ For !JdbcdataSource the entity attributes are :
+ 
+  * '''`query`''' (required) : The sql string using which to query the db
   * '''`deltaQuery`''' : Only used in delta-import
   * '''`parentDeltaQuery`''' : Only used in delta-import
   * '''`deletedPkQuery`''' : Only used in delta-import
-  * '''`rootEntity`''' : By default the entities falling under the document are root entities.
If it is set to false , the entity directly falling under that entity will be treated as the
root entity (so on and so forth). For every row returned by the roor entity a document is
created in Solr
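
As a rough illustration of the entity attributes listed above, a root entity and one child entity for a JDBC data source might be sketched as follows. The table, column, and entity names are hypothetical, and the `${item.id}` placeholder is only an assumed form of the 'templatized sql' syntax; this is a sketch, not a definitive configuration.

```xml
<!-- Illustrative sketch only: all table, column and entity names are hypothetical. -->
<document>
  <!-- Root entity: one Solr document is created for every row this query returns.
       pk names the column that becomes the id of the document. -->
  <entity name="item" pk="id" query="select id, name from item">
    <!-- Child entity joined to the root entity through a column of the parent row.
         The ${item.id} placeholder form is an assumption about the template syntax. -->
    <entity name="feature"
            query="select description from feature where item_id='${item.id}'"/>
  </entity>
</document>
```

Here the root entity carries the `pk`, while the child entity only needs a `name` and a `query`.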
  
  
  == Commands ==
@@ -275, +279 @@

   * For each row given by ''deltaQuery'', the parentDeltaQuery is executed.
   * If any row in the root/child entity changes, we regenerate the complete SOLR document
which contained that row.
  
- = Usage with XMl/HTTP Datasource =
+ = Usage with XML/HTTP Datasource =
  DataImportHandler can be used to index data from HTTP based data sources. This includes
indexing from REST/XML APIs as well as from RSS/ATOM Feeds.
  
  == Configuration in solrconfig.xml ==
@@ -301, +305 @@

   * '''`connectionTimeout`''' (optional):The default value is 5000ms 
   * '''`readTimeout`''' (optional): the default value is 10000ms 
  == Configuration in data-config.xml ==
- The entity for an xml/http data source can have the following attributes
+ The entity for an xml/http data source can have the following attributes over and above
the default attributes
-  * '''`name`''' (required) : A unique name used to identify an entity
-  * '''`processor`''' (required) : The value must be ''!XPathEntityProcessor''
   * '''`url`''' (required) : The url used to invoke the REST API. (Can be templatized)
   * '''`forEach`''' (required) : The xpath expression which demarcates a record. If there
are multiple types of record, separate them with '' |  ''
+ 
-  * '''`transformer`'''  : Transformers to be applied on this entity. (See the transformer
section)
-  * '''`dataSource`''' : The name of a datasource as put in the solrconfig.xml .(USed if
there are multiple datasources) 
-  * '''`pk`''' : The primary key for the entity. Only needed for the root entity. This will
be the id for the document
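
To make the xml/http attributes above concrete, an entity using them together with the default attributes might look roughly like this. The URL and xpath expressions are placeholders, and the nested `field` elements with `column` and `xpath` attributes are an assumption not described in this change; treat it as a sketch only.

```xml
<!-- Illustrative sketch only: the URL and all xpath expressions are hypothetical. -->
<!-- forEach lists two record types, separated by " | ". -->
<entity name="feed"
        processor="XPathEntityProcessor"
        url="http://example.com/feed.xml"
        forEach="/rss/channel/item | /feed/entry"
        pk="link">
  <!-- Assumed field mapping: each field pulls one value out of the current record. -->
  <field column="title" xpath="/rss/channel/item/title"/>
  <field column="link" xpath="/rss/channel/item/link"/>
</entity>
```

Only `url` and `forEach` are specific to the xml/http data source; `name`, `processor`, and `pk` come from the default attributes.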
  
   
   
