lucene-solr-user mailing list archives

From Fergus McMenemie <fer...@twig.me.uk>
Subject DIH using values from solrconfig.xml inside data-config.xml
Date Mon, 02 Feb 2009 17:04:55 GMT
Hello

Following several earlier postings, I found that I can define
variables in an invariants list within the DIH request handler
section of solrconfig.xml:

  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
       <str name="config">data-config.xml</str>
       </lst>
    <lst name="invariants">
       <str name="finstalldir">/Volumes/spare/ts</str>
       </lst>
    </requestHandler>  


I can also reference these variables within data-config.xml. This
works: the Solr field "test" is populated correctly. However, how do
I use this variable within my regex transformer? Here is my
data-config.xml:

   <dataConfig>
   <dataSource name="myfilereader" type="FileDataSource"/>    
    <document>
       <entity name="jc"
	       processor="FileListEntityProcessor"
	       fileName="^.*\.xml$"
	       newerThan="'NOW-1000DAYS'"
	       recursive="true"
	       rootEntity="false"
	       dataSource="null"
	       baseDir="/Volumes/spare/ts/fords/dtd/fordsxml/data">
	  <entity name="x"
	          dataSource="myfilereader"
		  processor="XPathEntityProcessor"
		  url="${jc.fileAbsolutePath}"
		  stream="false"
		  forEach="/record"
		  transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,HTMLStripTransformer">

   <field column="fileAbsolutePath" template="${jc.fileAbsolutePath}" />
   <field column="fileWebPath"      regex="${dataimporter.request.finstalldir}(.*)" replaceWith="$1" sourceColName="fileAbsolutePath"/>
   <field column="test"             template="${dataimporter.request.finstalldir}" />
   <field column="title"            xpath="/record/title" />
   <field column="para"             xpath="/record/sect1/para" stripHTML="true" />
   <field column="date"             xpath="/record/metadata/date[@qualifier='Date']" dateTimeFormat="yyyyMMdd" />
   	     </entity>
       </entity>
       </document>
    </dataConfig>

When indexing my content I get the following error:


INFO: SolrDeletionPolicy.onInit: commits:num=2
	commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_7,version=1233583868834,generation=7,filenames=[_7.frq,
_4.fdt, _7.tii, _7.fnm, _4.fdx, _7.tis, segments_7, _7.nrm, _7.prx]
	commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_8,version=1233583868835,generation=8,filenames=[segments_8]
Feb 2, 2009 5:00:50 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: last commit = 1233583868835
Feb 2, 2009 5:00:57 PM org.apache.solr.handler.dataimport.EntityProcessorBase applyTransformer
WARNING: transformer threw error
java.util.regex.PatternSyntaxException: Illegal repetition near index 0
${dataimporter.request.finstalldir}(.*)
^
	at java.util.regex.Pattern.error(Pattern.java:1650)
	at java.util.regex.Pattern.closure(Pattern.java:2706)
	at java.util.regex.Pattern.sequence(Pattern.java:1798)
	at java.util.regex.Pattern.expr(Pattern.java:1687)
	at java.util.regex.Pattern.compile(Pattern.java:1397)
	at java.util.regex.Pattern.<init>(Pattern.java:1124)
	at java.util.regex.Pattern.compile(Pattern.java:817)
	at org.apache.solr.handler.dataimport.RegexTransformer.getPattern(RegexTransformer.java:129)
	at org.apache.solr.handler.dataimport.RegexTransformer.process(RegexTransformer.java:88)
	at org.apache.solr.handler.dataimport.RegexTransformer.transformRow(RegexTransformer.java:74)
	at org.apache.solr.handler.dataimport.RegexTransformer.transformRow(RegexTransformer.java:42)
	at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
	at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
	at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
	at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:333)
	at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:359)
	at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:222)
	at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:155)
	at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:324)
	at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:384)
	at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:365)
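
The trace shows that the string handed to Pattern.compile is the literal,
unresolved attribute value. As a minimal sketch outside Solr (the class and
method names below are hypothetical, and the "resolved" string is only an
assumption about what the substituted value would look like), plain
java.util.regex rejects that text because "{" after "$" is read as the start
of a repetition count:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class UnresolvedPlaceholderDemo {

    // Returns the regex parser's error description, or null if the pattern compiles.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // The attribute value exactly as written in data-config.xml.
        String unresolved = "${dataimporter.request.finstalldir}(.*)";
        System.out.println("unresolved: " + compileError(unresolved));

        // Assumed form of the value if the variable had been substituted first.
        String resolved = "/Volumes/spare/ts(.*)";
        System.out.println("resolved:   " + compileError(resolved));
    }
}
```

This only reproduces the parser's behaviour; it says nothing about where DIH
does or does not substitute variables.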


Is there some simple escape or other syntax I should use here, or
would supporting this require an enhancement?
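
To make the intended result concrete, here is a rough sketch in plain Java of
what the fileWebPath field is meant to contain once the variable is resolved.
It is not DIH code: the class, the helper stripPrefix, and the file name
example.xml are hypothetical, and the substitution is done by hand.

```java
import java.util.regex.Pattern;

public class StripInstallDirDemo {

    // Hypothetical helper: inserts the install directory into the regex by hand,
    // then applies the same "(.*)" / "$1" replacement as the field definition.
    static String stripPrefix(String installDir, String absolutePath) {
        // Pattern.quote stops characters in the directory name from being
        // interpreted as regex syntax.
        Pattern p = Pattern.compile(Pattern.quote(installDir) + "(.*)");
        return p.matcher(absolutePath).replaceAll("$1");
    }

    public static void main(String[] args) {
        String installDir = "/Volumes/spare/ts"; // value of finstalldir in solrconfig.xml
        String path = "/Volumes/spare/ts/fords/dtd/fordsxml/data/example.xml"; // hypothetical file
        System.out.println(stripPrefix(installDir, path));
    }
}
```

A path that does not contain the install directory is returned unchanged,
since the pattern then matches nothing.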

Regards Fergus.
-- 

===============================================================
Fergus McMenemie               Email:fergus@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================
