lucene-java-user mailing list archives

Subject char mapping in lucene-icu
Date Sat, 15 Feb 2014 00:48:26 GMT


I am trying to use the lucene-icu library in solr-4.6.1, and I need to change a character mapping in lucene-icu.
I have made changes


and built the jar file using ant, but it did not help.

I took a look at lucene/analysis/icu/build.xml and saw these lines:

  <property name="gennorm2.src.files"
    value="nfc.txt nfkc.txt nfkc_cf.txt BasicFoldings.txt DiacriticFolding.txt DingbatFolding.txt
HanRadicalFolding.txt NativeDigitFolding.txt"/>
  <property name="gennorm2.tmp" value="${build.dir}/gennorm2/utr30.tmp"/>
  <property name="gennorm2.dst" value="${resources.dir}/org/apache/lucene/analysis/icu/utr30.nrm"/>
  <target name="gennorm2" depends="gen-utr30-data-files">
    <echo>Note that the gennorm2 and icupkg tools must be on your PATH. These tools
are part of the ICU4C package. See </echo>
    <mkdir dir="${build.dir}/gennorm2"/>
    <exec executable="gennorm2" failonerror="true">
      <arg value="-v"/>
      <arg value="-s"/>
      <arg value="${}"/>
      <arg line="${gennorm2.src.files}"/>
      <arg value="-o"/>
      <arg value="${gennorm2.tmp}"/>
    </exec>
    <!-- now convert binary file to big-endian -->
    <exec executable="icupkg" failonerror="true">
      <arg value="-tb"/>
      <arg value="${gennorm2.tmp}"/>
      <arg value="${gennorm2.dst}"/>
    </exec>
    <delete file="${gennorm2.tmp}"/>
  </target>
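
For reference, the two steps of the Ant target above (gennorm2, then icupkg -tb) could be reproduced by hand roughly as follows. This is only a sketch: the function names and the src_dir, tmp_file and dst_file parameters are hypothetical, and the source directory has to be supplied by the caller because the property passed after -s is not shown in the excerpt. The flags and file names are copied from the build file lines above.

```python
import subprocess
from pathlib import Path

# Source file names copied from the gennorm2.src.files property.
SRC_FILES = [
    "nfc.txt", "nfkc.txt", "nfkc_cf.txt", "BasicFoldings.txt",
    "DiacriticFolding.txt", "DingbatFolding.txt",
    "HanRadicalFolding.txt", "NativeDigitFolding.txt",
]


def build_commands(src_dir, tmp_file, dst_file):
    """Return the two command lines of the Ant target as argument lists."""
    gennorm2 = ["gennorm2", "-v", "-s", str(src_dir), *SRC_FILES,
                "-o", str(tmp_file)]
    # -tb is the flag the build file uses to convert the binary to big-endian.
    icupkg = ["icupkg", "-tb", str(tmp_file), str(dst_file)]
    return [gennorm2, icupkg]


def run(src_dir, tmp_file, dst_file):
    """Run both tools in order; both must be on PATH (ICU4C package)."""
    Path(tmp_file).parent.mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(src_dir, tmp_file, dst_file):
        # check=True plays the role of failonerror="true" in the Ant target.
        subprocess.run(cmd, check=True)
    Path(tmp_file).unlink()  # same cleanup as the <delete> task
```

Note that the second step matters here: the build file does not ship the raw gennorm2 output, it ships the file after icupkg has converted it.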

It looks like ant does not execute gennorm2. If I build the utr30.nrm file manually using gennorm2
and replace utr30.nrm in the jar file, then starting solr gives the following error:
Caused by: java.lang.RuntimeException: ICU data file error: Header authentication
failed, please check if you have a valid ICU data file
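
One way to narrow down an error like this might be to look at the header of the generated file directly. The sketch below assumes the header layout declared in ICU4C's udata.h (a 4-byte data header followed by the UDataInfo struct); the function name read_icu_header is hypothetical. A possible cause, not confirmed by anything above, is a file that was never passed through icupkg -tb and is therefore little-endian, or one whose format version comes from a different ICU release than the ICU4J jar used by Solr.

```python
import struct


def read_icu_header(data: bytes) -> dict:
    """Parse the first 24 bytes of an ICU binary data file.

    Assumed layout (from ICU4C udata.h): headerSize(2) magic1 magic2
    size(2) reservedWord(2) isBigEndian charsetFamily sizeofUChar
    reservedByte dataFormat[4] formatVersion[4] dataVersion[4].
    """
    if len(data) < 24:
        raise ValueError("file too short to contain an ICU data header")
    big_endian = data[8] != 0
    # Multi-byte fields are stored in the byte order named by isBigEndian.
    order = ">" if big_endian else "<"
    (header_size,) = struct.unpack_from(order + "H", data, 0)
    return {
        "header_size": header_size,
        "magic_ok": data[2] == 0xDA and data[3] == 0x27,
        "big_endian": big_endian,
        "charset_family": data[9],
        "sizeof_uchar": data[10],
        "data_format": data[12:16].decode("ascii", errors="replace"),
        "format_version": tuple(data[16:20]),
        "data_version": tuple(data[20:24]),
    }
```

Reading the first bytes of the hand-built utr30.nrm and of the one inside the original jar, and comparing the two dictionaries, should show whether byte order or format version differs.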

My questions are:
 1. If the above code in the build file is not executed, how is the utr30 file generated?
 2. How do I change a character mapping?

