lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "DataImportHandler" by JamesDyer
Date Mon, 19 Nov 2012 18:23:18 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "DataImportHandler" page has been changed by JamesDyer:
http://wiki.apache.org/solr/DataImportHandler?action=diff&rev1=326&rev2=327

Comment:
SOLR-4068: Refactor DIH - VariableResolver & Evaluator

  == What is a row? ==
  A row in !DataImportHandler is a Map (Map<String, Object>). In the map, the key is
the name of the field and the value can be anything that is a valid Solr type. The value
can also be a Collection of valid Solr types (this may get mapped to a multi-valued field).
If the !DataSource is an RDBMS, a query cannot emit a multivalued field. It is, however, possible to
create a multivalued field by joining an entity with another, i.e. if the sub-entity returns
multiple rows for one row from the parent entity, they can go into a multivalued field. If the datasource
is XML, it is possible to return a multivalued field.
  
- == A VariableResolver ==
+ == VariableResolver ==
- A !VariableResolver is the component which replaces all those placeholders such as `${<name>}`.
It is a multilevel Map.  Each namespace is a Map and namespaces are separated by periods (.)
. eg if there is a placeholder ${item.ID} , 'item' is a nampespace (which is a map) and 'ID'
is a value in that namespace. It is possible to nest namespaces like ${item.x.ID} where x
could be another Map. A reference to the current !VariableResolver can be obtained from the
Context. Or the object can be directly consumed by using ${<name>} in 'query' for RDBMS
queries or 'url' in Http .
+ The !VariableResolver is the component that replaces placeholders such as `${<name>}`.
It is a multilevel Map.  Each namespace is a Map, and namespaces are separated by periods (.).
For example, in the placeholder ${item.ID}, 'item' is a namespace (which is a Map) and 'ID'
is a value in that namespace. Namespaces can be nested, as in ${item.x.ID}, where x
could be another Map. A reference to the current !VariableResolver can be obtained from the
Context. Alternatively, a value can be consumed directly by using ${<name>} in 'query' for RDBMS
queries or in 'url' for HTTP.
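The namespace lookup described above can be illustrated with a small standalone sketch. It is not the DIH implementation: the class and method names (`NamespaceResolverSketch`, `resolve`, `replaceTokens`) are hypothetical, function-style placeholders are not handled, and replacing an unresolved placeholder with an empty string is a choice made for this sketch only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of resolving ${a.b.c} against nested Maps.
public class NamespaceResolverSketch {
    // Matches ${...} and captures the dotted name inside the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Walks the nested maps one namespace at a time; returns null if a step is missing.
    static Object resolve(Map<String, Object> root, String dottedName) {
        Object current = root;
        for (String part : dottedName.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(part);
        }
        return current;
    }

    // Replaces every ${name} in the template (empty string when unresolved: sketch assumption).
    static String replaceTokens(Map<String, Object> root, String template) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = resolve(root, m.group(1));
            String text = value == null ? "" : value.toString();
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> x = new HashMap<>();
        x.put("ID", 42);
        Map<String, Object> item = new HashMap<>();
        item.put("ID", 7);
        item.put("x", x); // nested namespace, reachable as ${item.x.ID}
        Map<String, Object> root = new HashMap<>();
        root.put("item", item);

        // prints: select * from t where id = 7 and sub = 42
        System.out.println(replaceTokens(root,
            "select * from t where id = ${item.ID} and sub = ${item.x.ID}"));
    }
}
```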
  
- === Custom formatting in query and url using Functions ===
+ == Evaluators - Custom formatting in queries and urls ==
- While the namespace concept is useful , the user may want to put some computed value into
the query or url for example there is a Date object and your datasource  accepts Date in some
custom format . There are a few functions provided by the !DataImportHandler which can do
some of these.
+ While the namespace concept is useful, the user may want to put a computed value into
the query or url. For example, there may be a Date object while your datasource accepts dates only in some
custom format.
  
-  * ''formatDate'' : It is used like this `'${dataimporter.functions.formatDate(item.ID,
'yyyy-MM-dd HH:mm')}'` . The first argument can be a valid value from the !VariableResolver
and the second value can be a format string (use !SimpleDateFormat) . The first argument can
be a computed value eg: `'${dataimporter.functions.formatDate('NOW-3DAYS', 'yyyy-MM-dd HH:mm')}'`
and it uses the syntax of the datemath parser in Solr. (note that it must enclosed in single
quotes) . <!> Note . This syntax has been changed in 1.4 . The second parameter was
not enclosed in single quotes earlier. But it will continue to work without single quote also.
+ === formatDate ===
+  Use this to format dates as strings.  It takes three parameters (prior to Solr 4.1, it
takes two):
+   1. A variable that refers to a date, or a datemath expression.
+   2. A date format string.  See java.text.SimpleDateFormat javadoc for valid date formats.
(In Solr 4.1 and later, this must be enclosed in single quotes.  In Solr 1.4 - 4.0, quotes are optional.
 Prior to Solr 1.4, this must not be enclosed in single quotes.)
+   3. <!> [[Solr4.1]] (optional)  The locale code to use when formatting dates, enclosed
in single quotes. See java.util.Locale javadoc for details.  If omitted, this defaults to
the ROOT Locale. (Note: prior to Solr 4.1, formatDate would always use the current machine's
default locale.)
+ 
+ 
+  * example using a variable:  `'${dataimporter.functions.formatDate(item.ID, 'yyyy-MM-dd
HH:mm')}'`
+  * example using a datemath expression:  `'${dataimporter.functions.formatDate('NOW-3DAYS',
'yyyy-MM-dd HH:mm')}'`
+  * example specifying a Locale: <!> [[Solr4.1]]  `'${dataimporter.functions.formatDate(item.ID,
'yyyy-MM-dd HH:mm', 'th_TH')}'`
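To make the format-string and locale parameters concrete, here is a minimal standalone sketch built on java.text.SimpleDateFormat. It is not the DIH evaluator itself: `FormatDateSketch`, `parseLocale` and `formatDate` are hypothetical names, the `language_COUNTRY` parsing is a simplification, and the time zone is pinned to UTC only so the sketch gives the same result on every machine.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration of formatting a date with a pattern and an optional locale code.
public class FormatDateSketch {
    // Turns a code such as "th_TH" into a Locale; falls back to the ROOT locale when omitted.
    static Locale parseLocale(String code) {
        if (code == null || code.isEmpty()) {
            return Locale.ROOT;
        }
        String[] parts = code.split("_");
        Locale.Builder builder = new Locale.Builder().setLanguage(parts[0]);
        if (parts.length > 1) {
            builder.setRegion(parts[1]);
        }
        return builder.build();
    }

    // Formats the date using the given SimpleDateFormat pattern and locale code.
    static String formatDate(Date date, String pattern, String localeCode) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, parseLocale(localeCode));
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // sketch assumption: fixed zone
        return fmt.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        // No locale code given, so the ROOT locale is used.
        System.out.println(formatDate(epoch, "yyyy-MM-dd HH:mm", null));
        // Same pattern with an explicit locale code; the output depends on that locale's calendar.
        System.out.println(formatDate(epoch, "yyyy-MM-dd HH:mm", "th_TH"));
    }
}
```

Passing the locale explicitly, rather than relying on the machine default, keeps the formatted string stable across servers, which is the motivation the parameter list gives for the ROOT default.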
+ 
+ === escapeSql ===
-  * ''escapeSql'' : Use this to escape special sql characters . eg : `'${dataimporter.functions.escapeSql(item.ID)}'`.
Takes only one argument and must be a valid value in the !VaraiableResolver.
+ Use this to escape special SQL characters, e.g. `'${dataimporter.functions.escapeSql(item.ID)}'`.
It takes exactly one argument, which must be a valid value in the !VariableResolver.
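As a rough illustration of what escaping a value before splicing it into a query can look like, the sketch below doubles single quotes, double quotes and backslashes. Which characters DIH actually treats as special is not stated here, so that character set is an assumption, and `EscapeSqlSketch` is a hypothetical name rather than a DIH class.

```java
// Hypothetical illustration of escaping a value before it is embedded in a SQL string literal.
public class EscapeSqlSketch {
    // Assumption for this sketch: backslash, single quote and double quote are escaped by doubling.
    static String escapeSql(String value) {
        return value
            .replace("\\", "\\\\")
            .replace("'", "''")
            .replace("\"", "\"\"");
    }

    public static void main(String[] args) {
        // prints: select * from authors where name = 'O''Reilly'
        System.out.println("select * from authors where name = '" + escapeSql("O'Reilly") + "'");
    }
}
```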
+ 
+ === encodeUrl ===
-  * ''encodeUrl'' : Use this to encode urls . eg : `'${dataimporter.functions.encodeUrl(item.ID)}'`
. Takes only one argument and must be a valid value in the !VariableResolver
+ Use this to encode URLs, e.g. `'${dataimporter.functions.encodeUrl(item.ID)}'`. It takes
exactly one argument, which must be a valid value in the !VariableResolver.
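The effect of URL-encoding a value before placing it in a 'url' attribute can be sketched with the standard java.net.URLEncoder. This is an illustration only: `EncodeUrlSketch` is a hypothetical name, and the use of UTF-8 is an assumption of this sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration of encoding a value for use inside a URL query string.
public class EncodeUrlSketch {
    static String encodeUrl(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8"); // sketch assumption: UTF-8
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints: http://example.com/search?q=a+b%26c
        System.out.println("http://example.com/search?q=" + encodeUrl("a b&c"));
    }
}
```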
  
- ==== Custom Functions ====
+ == Custom Evaluators ==
  It is possible to plug in custom functions into DIH. Implement an [[http://lucene.apache.org/solr/api/org/apache/solr/handler/dataimport/Evaluator.html|Evaluator]]
and specify it in the data-config.xml . Following is an example of an evaluator which does
a 'toLowerCase' on a String.
  
  {{{
@@ -1099, +1112 @@

    </document>
  </dataConfig>
  }}}
- The implementation of !LowerCaseFunctionEvaluator
+ The implementation of !LowerCaseFunctionEvaluator 
  
+ <!> [[Solr4.1]] this example depends on API modifications made in Solr 4.1
  {{{
    public class LowerCaseFunctionEvaluator extends Evaluator{
      public String evaluate(String expression, Context context) {
-       List l = EvaluatorBag.parseParams(expression, context.getVariableResolver());
+       List<Object> l = parseParams(expression, context.getVariableResolver());
- 
        if (l.size() != 1) {
            throw new RuntimeException("'toLowerCase' must have only one parameter ");
        }
-       return l.get(0).toString().toLowerCase();
+       return l.get(0).toString().toLowerCase(Locale.ROOT);
- 
      }
- 
    }
  }}}
  === Accessing request parameters ===
