lucene-dev mailing list archives

From "Smiley, David W." <dsmi...@mitre.org>
Subject FST bug?
Date Tue, 17 Jul 2012 14:44:00 GMT
I am building an FST.  Here is an excerpt from my code:
    //build the FST from the workingSet
    Builder<IntsRef> builder = new Builder<IntsRef>(FST.INPUT_TYPE.BYTE4, outputs);
    IntsRef sortedKeys[] = workingSet.keySet().toArray(new IntsRef[workingSet.size()]);
    Arrays.sort(sortedKeys);

    int maxPhraseLen = 0;
    int maxDocsLen = 0;
    for (IntsRef termIdsPhrase : sortedKeys) {
      IntsRef solrIds = workingSet.remove(termIdsPhrase);//remove to save memory
      assert termIdsPhrase.length > 0 && solrIds.length > 0;
      builder.add(termIdsPhrase, solrIds);
    }

    return builder.finish();

For what it's worth, the input side is at most 7 integers long, and the output side is typically
the same length, but a small number of outputs run as long as 48K integers.  There are 10M
entries.

After many calls to builder.add(), with assertions enabled, I eventually get this exception:

Exception in thread "main" java.lang.AssertionError: size must be positive (got -262796219): likely integer overflow?
	at org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:336)
	at org.apache.lucene.util.fst.FST.addNode(FST.java:672)
	at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:122)
	at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:195)
	at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:287)
	at org.apache.lucene.util.fst.Builder.add(Builder.java:392)
	at org.mitre.opensextant.solr.TaggerFstCorpus.buildPhrases(TaggerFstCorpus.java:176)
	at org.mitre.opensextant.solr.TaggerFstCorpus.doBuild(TaggerFstCorpus.java:61)
	at org.mitre.opensextant.solr.BuildCorpusExperiment.main(BuildCorpusExperiment.java:31)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
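
As a point of reference for the "likely integer overflow?" hint in the assertion message, here is a minimal sketch of how a requested size can turn negative when it is computed in plain int arithmetic. This is not Lucene's code: the class IntOverflowSketch and its methods are hypothetical, and the numbers are chosen only to show the wraparound. It assumes the size being grown is tracked as an int, which the ArrayUtil.grow frame suggests but does not prove.

```java
public class IntOverflowSketch {

    // Hypothetical helper: sums two non-negative sizes in int arithmetic.
    // Past Integer.MAX_VALUE the result silently wraps to a negative value.
    static int uncheckedSize(int currentBytes, int extraBytes) {
        return currentBytes + extraBytes;
    }

    // Same sum widened to long, so the true value is preserved.
    static long widenedSize(int currentBytes, int extraBytes) {
        return (long) currentBytes + extraBytes;
    }

    public static void main(String[] args) {
        // Illustrative values only, not taken from the failing run.
        int currentBytes = Integer.MAX_VALUE - 10;
        int extraBytes = 100;

        int wrapped = uncheckedSize(currentBytes, extraBytes);
        long actual = widenedSize(currentBytes, extraBytes);

        System.out.println("int sum:  " + wrapped);
        System.out.println("long sum: " + actual);

        // Math.addExact throws instead of wrapping, which makes the
        // overflow visible at the point where it happens.
        try {
            Math.addExact(currentBytes, extraBytes);
        } catch (ArithmeticException e) {
            System.out.println("addExact: " + e.getMessage());
        }
    }
}
```

If something like this is what happens inside addNode, the structure would have crossed the 2^31 - 1 limit of an int-indexed array, and no amount of extra heap would help.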


This is on Lucene 4.0-ALPHA using JDK 7.  I'm using 6GB of heap; my attempts to use less resulted
in Out-of-memory errors.  What FST size limitation am I bumping up against?

~ David