lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "DataImportHandler" by ShalinMangar
Date Mon, 31 Mar 2008 11:55:05 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by ShalinMangar:
http://wiki.apache.org/solr/DataImportHandler

The comment on the change is:
Added detail on script transformer

------------------------------------------------------------------------------
   * '''`replaceWith`''' : Used along with `regex`. It is equivalent to the method `new String(<sourceColVal>).replaceAll(<regex>,
<replaceWith>)`
  Here the attributes 'regex' and 'sourceColName' are custom attributes used by the transformer.
It reads the field 'full_name' from the resultset and transforms it into two target fields, 'firstName'
and 'lastName'. So even though the query returned only one column, 'full_name', in the resultset,
the Solr document gets two extra fields, 'firstName' and 'lastName', which are 'derived' fields.
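
The regex behaviour described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name `RegexTransformSketch`, both method names, and the pattern `(\w+) (\w+)` are hypothetical and do not come from the wiki page or the DataImportHandler code. Only the `replaceAll` equivalence and the field names 'full_name', 'firstName' and 'lastName' are from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTransformSketch {

    // Mirrors the documented equivalence for 'replaceWith':
    // new String(sourceColVal).replaceAll(regex, replaceWith)
    static String replace(String sourceColVal, String regex, String replaceWith) {
        return sourceColVal.replaceAll(regex, replaceWith);
    }

    // Derives 'firstName' and 'lastName' from the single source column 'full_name'.
    // The pattern below is an assumed example, not the one used on the wiki page.
    static Map<String, Object> splitName(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<String, Object>(row);
        Matcher m = Pattern.compile("(\\w+) (\\w+)")
                .matcher(String.valueOf(row.get("full_name")));
        if (m.matches()) {
            out.put("firstName", m.group(1));
            out.put("lastName", m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<String, Object>();
        row.put("full_name", "Jane Doe");
        // The returned row keeps 'full_name' and gains the two derived fields.
        System.out.println(splitName(row));
        System.out.println(replace("a,b;c", "[,;]", " "));
    }
}
```

The source column is left in the row, so the derived fields are additions and not replacements.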
  
- The 'emailids' field in the table can be a comma separated value. So it ends up giving out
one or more than one email ids and we expect the 'mailId' to be a multivalued field in Solr
+ The 'emailids' field in the table can be a comma separated value. So it ends up giving out
one or more than one email ids and we expect the 'mailId' to be a multivalued field in Solr.
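
As a rough illustration of how a comma separated 'emailids' value becomes several 'mailId' values, here is a minimal Java sketch. The class and method names are hypothetical, and trimming whitespace and skipping empty entries are assumptions made for the example, not documented behaviour of the transformer.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitEmailsSketch {

    // Splits a comma separated column value into the values of a multivalued field.
    // Trimming and dropping empty entries are assumptions of this sketch.
    static List<String> splitEmailIds(String emailids) {
        List<String> mailIds = new ArrayList<String>();
        if (emailids == null) {
            return mailIds;
        }
        for (String part : emailids.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                mailIds.add(trimmed);
            }
        }
        return mailIds;
    }

    public static void main(String[] args) {
        System.out.println(splitEmailIds("a@example.com, b@example.com"));
    }
}
```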
+ 
+ === ScriptTransformer ===
+ It is possible to write transformers in JavaScript or any other scripting language supported
by Java. This feature requires '''Java 6'''.
+ 
+ {{{
+ <dataConfig>
+ 	<script>
+ 		<![CDATA[
+ 		function f1(row) {
+ 			row.put('message', 'Hello World!');
+ 			return row;
+ 		}
+ 		]]>
+ 	</script>
+ 	<document>
+ 		<entity name="e" pk="id" transformer="script:f1" query="select * from X">
+ 			....
+ 		</entity>
+ 	</document>
+ </dataConfig>
+ }}}
+ 
+  * You can put script tags inside the ''dataConfig'' node. By default, the language is assumed
to be JavaScript. If you are using another language, specify it on the script tag with the attribute
''language="MyLanguage"''.
+  * Write as many transformer functions as you want to use. Each such function must accept
a ''row'' variable corresponding to ''Map<String, Object>'' and return a row (after
applying transformations).
+  * Make an entity use a function by specifying ''transformer="script:<function-name>"''
in the ''entity'' node.
+  * In the above data-config, the JavaScript function ''f1'' will be executed once for each
row returned by entity e.
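
The contract in the list above (a function takes a ''row'' map, modifies it, returns it, and runs once per row) can be pictured with a plain Java analogue. This is a sketch of the idea only: `RowTransformerSketch`, `RowFunction`, `F1` and `transformAll` are hypothetical names, and none of this is the DataImportHandler API or its script-engine plumbing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowTransformerSketch {

    // Hypothetical stand-in for a script function such as f1; not part of Solr.
    interface RowFunction {
        Map<String, Object> apply(Map<String, Object> row);
    }

    // Java analogue of the wiki's f1: add a 'message' field and return the row.
    static final RowFunction F1 = new RowFunction() {
        public Map<String, Object> apply(Map<String, Object> row) {
            row.put("message", "Hello World!");
            return row;
        }
    };

    // Applies the function once to each row, as described for entity e.
    static List<Map<String, Object>> transformAll(List<Map<String, Object>> rows,
                                                  RowFunction f) {
        List<Map<String, Object>> out = new ArrayList<Map<String, Object>>();
        for (Map<String, Object> row : rows) {
            out.add(f.apply(row));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<Map<String, Object>>();
        Map<String, Object> row = new LinkedHashMap<String, Object>();
        row.put("id", 1);
        rows.add(row);
        System.out.println(transformAll(rows, F1));
    }
}
```

Because the function receives and returns the same map type, several such functions could in principle be applied one after another to the same row.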
  
  == EntityProcessor ==
  Each entity is handled by a default entity processor called !SqlEntityProcessor. This works
well for systems which use an RDBMS as a datasource. For other kinds of datasources, such as REST
or non-SQL datasources, you can choose to implement the interface `org.apache.solr.handler.dataimport.EntityProcessor`
