Hi Erick, sorry for the slow reply on this one. Make sure the correct icu4c version is at the beginning of your PATH before running ant regenerate. It should match the icu4j version; it looks to me like you have a mismatch.
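
For what it's worth, here is a rough sketch of the kind of comparison I mean. It is plain Java and only compares two version strings that you supply yourself: one taken from the icu4j jar name (e.g. icu4j-62.1.jar) and one from whatever icu4c tools are first on your PATH. The class and method names are hypothetical, and comparing only the major version is my assumption about what "match" needs to mean here.

```java
// Hypothetical helper, not part of Lucene or ICU.
class IcuVersionMatch {
    // Extract the major version from a string such as "62.1".
    static int majorOf(String version) {
        String[] parts = version.trim().split("\\.");
        return Integer.parseInt(parts[0]);
    }

    // Assumption: the two libraries "match" when their major versions are equal.
    static boolean sameMajor(String icu4jVersion, String icu4cVersion) {
        return majorOf(icu4jVersion) == majorOf(icu4cVersion);
    }

    public static void main(String[] args) {
        String icu4j = "62.1"; // from the jar name icu4j-62.1.jar
        // Pass the icu4c version found on your PATH as the first argument;
        // the default below is a placeholder, not a value from this thread.
        String icu4c = args.length > 0 ? args[0] : "62.1";
        if (sameMajor(icu4j, icu4c)) {
            System.out.println("icu4j " + icu4j + " and icu4c " + icu4c + " share a major version");
        } else {
            System.out.println("Mismatch: icu4j " + icu4j + " vs icu4c " + icu4c);
        }
    }
}
```

If the versions differ, the regenerated data files may be in a format the icu4j jar does not accept.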

On Wed, Dec 4, 2019, 2:32 PM Erick Erickson <erickerickson@gmail.com> wrote:
I have the git pull working for fetching a particular revision of nfkc.txt and the like. Now TestICUFoldingFilterFactory fails tests. Here's what I could find on that topic:

  public static final Normalizer2 NORMALIZER = Normalizer2.getInstance(
    // TODO: if the wrong version of the ICU jar is used, loading these data files may give a strange error.
    // maybe add an explicit check? http://icu-project.org/apiref/icu4j/com/ibm/icu/util/VersionInfo.html
    "utr30", Normalizer2.Mode.COMPOSE);
eventually calls:

 public Normalizer2Impl load(ByteBuffer bytes) {
    try {
      this.dataVersion = ICUBinary.readHeaderAndDataVersion(bytes, 1316121906, IS_ACCEPTABLE);
which throws
Caused by: com.ibm.icu.util.ICUUncheckedIOException: java.io.IOException: ICU data file error: Header authentication failed, please check if you have a valid ICU data file; data format 4e726d32, format version

0x4e726d32==1316121906, so the data format looks ok to my uninformed eye.
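
A small standalone check of that arithmetic, in case it helps anyone following along. It is only a sketch: the class and method names are made up for illustration, and it uses nothing from ICU. It compares the hex value from the error message with the decimal constant passed to readHeaderAndDataVersion, then decodes the four bytes as ASCII.

```java
// Hypothetical helper, not part of ICU or Lucene.
class DataFormatCheck {
    // Decode a 32-bit data-format identifier into its four ASCII characters,
    // most significant byte first.
    static String toAscii(int dataFormat) {
        char[] chars = new char[4];
        for (int i = 0; i < 4; i++) {
            chars[i] = (char) ((dataFormat >>> (24 - 8 * i)) & 0xFF);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        int fromErrorMessage = 0x4e726d32; // "data format 4e726d32" in the exception
        int fromLoadCall = 1316121906;     // constant passed in Normalizer2Impl.load
        System.out.println(fromErrorMessage == fromLoadCall); // prints "true"
        System.out.println(toAscii(fromErrorMessage));        // prints "Nrm2"
    }
}
```

So the data format identifier itself matches; the cut-off "format version" part of the message is what this check does not cover.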

The jar file I have for icu is: icu4j-62.1.jar

I looked at the nfc* files that are now fetched from github and at least ./lucene/analysis/icu/src/data/utr30/nfc.txt is identical.

I’ll get back to this later this afternoon. Meanwhile, any pointers?