lucene-solr-commits mailing list archives

From Apache Wiki <>
Subject [Solr Wiki] Trivial Update of "DataImportHandler" by OkkeKlein
Date Fri, 11 Nov 2011 12:36:36 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "DataImportHandler" page has been changed by OkkeKlein:

  <<Anchor(commands)>> The handler exposes all of its API as HTTP requests. The
following are the possible operations:
   * '''full-import''' : Full Import operation can be started by hitting the URL `http://<host>:<port>/solr/dataimport?command=full-import`
    * This operation will be started in a new thread, and the ''status'' attribute in the response will show ''busy''.
    * The operation may take some time depending on the size of the dataset.
    * When the full-import command is executed, it stores the start time of the operation in a
file located at ''conf/''
@@ -149, +148 @@

     * '''commit''' : (default 'true'). Tells whether to commit after the operation.
     * '''optimize''' : (default 'true'). Tells whether to optimize after the operation.
     * '''debug''' : (default 'false'). Runs in debug mode. It is used by the interactive
development mode ([[#interactive|see here]]).
      * Please note that in debug mode, documents are never committed automatically. If you
want to run debug mode and commit the results too, add 'commit=true' as a request parameter.
   * '''delta-import''' : For incremental imports and change detection, run the command `http://<host>:<port>/solr/dataimport?command=delta-import`
. It supports the same clean, commit, optimize and debug parameters as the full-import command.
   * '''status''' : To find the status of the current command, hit the URL `http://<host>:<port>/solr/dataimport`
. It returns detailed statistics on the number of documents created and deleted, queries run, rows fetched,
status, etc.
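The commands above can be issued from any HTTP client. As a minimal sketch, the following Python helper builds the request URLs described above (the host and port are placeholders; adjust them for your Solr installation, and note that `dih_url` is an illustrative helper name, not part of Solr):

```python
# Sketch: building DataImportHandler request URLs. Host, port, and the
# helper name are illustrative assumptions, not part of the Solr API.
from urllib.parse import urlencode

def dih_url(host, port, command=None, **params):
    """Build a /solr/dataimport URL for the given command and options."""
    base = f"http://{host}:{port}/solr/dataimport"
    query = {}
    if command:
        query["command"] = command
    query.update(params)
    return base + ("?" + urlencode(query) if query else "")

# full-import in debug mode, committing the results as noted above
print(dih_url("localhost", 8983, "full-import", debug="true", commit="true"))
# status check: plain /solr/dataimport with no command parameter
print(dih_url("localhost", 8983))
```

Fetching such a URL (for example with `urllib.request.urlopen`) triggers the operation; the status URL can then be polled until the ''busy'' attribute clears.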
@@ -330, +328 @@

   . {{{
   deltaQuery="SELECT MAX(did) FROM ${dataimporter.request.dataView}"
-       Changed to:
+   . Changed to:
   deltaQuery="SELECT MAX(did) AS did FROM ${dataimporter.request.dataView}"
@@ -605, +603 @@

   * '''`dateTimeFormat`''' : The format used for parsing this field. This must comply with
the syntax of java [[|SimpleDateFormat]].
   * '''`sourceColName`''' : The column on which the dateFormat is to be applied. If this
is absent, the source and target columns are the same.
-  * '''`locale`''' : the locale to use for date transformations. If no locale attribute is
specified, then the default one on the system is used.
+  * '''`locale`''' : The locale to use for date transformations. If no locale attribute is
specified, then the default one on the system is used.
  The above field definition is used in the RSS example to parse the publish date of the RSS
feed item.
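Putting these attributes together, a field definition along these lines could parse an RFC 822-style date from an RSS feed (the column names, pattern, and locale value here are illustrative assumptions, not taken from the example verbatim):

```xml
<!-- Hypothetical sketch: applying dateTimeFormat, sourceColName, and
     locale as described above. Names and the pattern are illustrative. -->
<field column="pubDate"
       sourceColName="pub_date_raw"
       dateTimeFormat="EEE, dd MMM yyyy HH:mm:ss z"
       locale="en-US"/>
```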
@@ -1001, +999 @@

  * On running `command=full-import`, the root entity (A) is executed first.
  * Each row emitted by the 'query' in entity 'A' is fed into its sub-entities B and C.
  * The queries in B and C use a column in 'A' to construct their queries using placeholders
like `${A.a}`
   * B has a url (B is an xml/http datasource)
   * C has a query
  * C has two transformers ('f' and 'g')
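The entity layout described above can be sketched as nested `<entity>` elements; all names, URLs, and queries below are illustrative assumptions, not the document's actual example:

```xml
<!-- Hypothetical sketch of the A/B/C layout: root entity A feeds each
     row into B (an xml/http datasource) and C (a SQL query with two
     transformers). Every name and query here is illustrative. -->
<entity name="A" query="select a from table_a">
  <entity name="B" dataSource="xmlSource"
          url="http://example.com/feed?id=${A.a}"/>
  <entity name="C" transformer="f,g"
          query="select * from table_c where c_id = '${A.a}'"/>
</entity>
```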
@@ -1505, +1502 @@

  = Troubleshooting =
  * If you are having trouble indexing international characters, try setting the '''encoding'''
attribute to "UTF-8" on the dataSource element (example below). This should ensure that international
character data (stored in UTF-8) ingested by the given source is preserved.
   . {{{
    <dataSource type="FileDataSource" encoding="UTF-8"/>
}}}