lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "DataImportHandler" by NoblePaul
Date Thu, 13 Nov 2008 04:44:48 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by NoblePaul:
http://wiki.apache.org/solr/DataImportHandler

The comment on the change is:
onError, FieldreaderDataSource

------------------------------------------------------------------------------
   * '''`dataSource`''' : The name of a datasource as put in the datasource definition (used if
there are multiple datasources)
   * '''`pk`''' : The primary key for the entity. It is '''optional''' and only needed when
using delta-imports. It has no relation to the uniqueKey defined in schema.xml but they both
can be the same.
   * '''`rootEntity`''' : By default the entities falling under the document are root entities.
If it is set to false , the entity directly falling under that entity will be treated as the
root entity (so on and so forth). For every row returned by the root entity a document is
created in Solr
+  * '''`onError`''' : (abort|skip|continue). The default value is 'abort'. 'skip' skips
the current document. 'continue' continues as if the error did not happen. <!> ["Solr1.4"]
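
As a minimal sketch of how the `onError` attribute described above might be set on an entity: the entity name, query, and column names here are hypothetical and not taken from the wiki page.
{{{
<!-- Hypothetical entity: a failing row skips the current document instead of aborting the import -->
<entity name="item" query="select * from item" onError="skip">
  <field column="ID" name="id"/>
</entity>
}}}
With `onError="abort"` (the default) the same error would stop the import.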
  
  For !SqlEntityProcessor the entity attributes are :
  
@@ -687, +688 @@

  
  == DataSource ==
  [[Anchor(datasource)]]
+ A class can extend `org.apache.solr.handler.dataimport.DataSource` . [http://svn.apache.org/viewvc/lucene/solr/trunk/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/DataSource.java?view=markup
See source] 
- A class can extend `org.apache.solr.handler.dataimport.DataSource` 
- {{{
- public abstract class DataSource<T> {
  
-   /**
-    * Initializes the DataSource with the <code>Context</code> and
-    * initialization properties.
-    * <p/>
-    * This is invoked by the <code>DataImporter</code> after creating an
-    * instance of this class.
-    *
-    * @param context
-    * @param initProps
-    */
-   public abstract void init(Context context, Properties initProps);
- 
-   /**
-    * Get records for the given query.The return type depends on the
-    * implementation .
-    *
-    * @param query The query string. It can be a SQL for JdbcDataSource or a URL
-    *              for HttpDataSource or a file location for FileDataSource or a custom
-    *              format for your own custom DataSource.
-    * @return Depends on the implementation. For instance JdbcDataSource returns
-    *         an Iterator<Map <String,Object>>
-    */
-   public abstract T getData(String query);
- 
-   /**
-    * Cleans up resources of this DataSource after use.
-    */
-   public abstract void close();
- }
- }}}
  and can be used as a !DataSource. It must be configured in the dataSource definition
  {{{
  <dataSource type="com.foo.FooDataSource" prop1="hello"/>
@@ -748, +717 @@

  The attributes are:
   * '''`basePath`''': (optional) The base path relative to which the value is evaluated
   * '''`encoding`''': (optional) If the files are to be read in an encoding that is not the same
as the platform encoding
+ 
+ === FieldReaderDataSource ===
+ <!> ["Solr1.4"]
+ 
+ This can be used like an !HttpDataSource. The signature is as follows:
+ {{{
+ public class FieldReaderDataSource extends DataSource<Reader>  
+ }}}
+ This can be useful for users who have a DB field containing XML and wish to use a nested
X!PathEntityProcessor.
+ The datasource may be configured as follows:
+ {{{
+   <dataSource name="f" type="FieldReaderDataSource" />
+ }}}
+ 
+ The entity which uses this datasource must give the variable name of the field in the attribute dataField="field-name".
For instance, if the parent entity 'dbEntity' has a field called 'xmlData', then the child
entity would look like:
+ {{{
+ <entity dataSource="f" processor="XPathEntityProcessor" dataField="dbEntity.xmlData"/>
+ }}}
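
Putting the pieces above together, a complete data-config might look roughly like the following sketch. The JDBC driver and URL, the `item` table, the `/record` structure, and the `title` column are assumptions made for illustration; only the `FieldReaderDataSource` datasource and the `dataField` attribute come from the text above.
{{{
<dataConfig>
  <!-- Hypothetical JDBC datasource for the parent entity -->
  <dataSource name="db" driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:/temp/example/ex" user="sa"/>
  <!-- Datasource that reads XML from a field of the parent row -->
  <dataSource name="f" type="FieldReaderDataSource"/>
  <document>
    <!-- Parent entity: each row carries an XML string in the column xmlData -->
    <entity name="dbEntity" dataSource="db" query="select id, xmlData from item">
      <!-- Child entity: parses the XML held in dbEntity.xmlData -->
      <entity name="xmlEntity" dataSource="f" processor="XPathEntityProcessor"
              dataField="dbEntity.xmlData" forEach="/record">
        <field column="title" xpath="/record/title"/>
      </entity>
    </entity>
  </document>
</dataConfig>
}}}
In this sketch the nested entity runs once per parent row, so each `xmlData` value is parsed in the context of the document created for that row.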
+ 
  
  == Boosting , Skipping documents ==
  It is possible to decide at runtime to skip or boost a particular document.
