lucene-java-user mailing list archives

From Dawid Weiss <dawid.we...@gmail.com>
Subject Re: Building FST-like automaton queries
Date Tue, 28 Feb 2012 15:40:41 GMT
> For steps 2 and 3 you shouldn't use FST at all.  Instead, for 2) use
> BasicAutomata.makeString(String) on each of your expanded terms, then
> BasicOperations.union on all of those automata to make a single

How many input strings do you have? The API Mike mentioned comes from
a port of the Brics library -- building a separate automaton for each
string and then taking a union will result in an attempt to minimize
the result, and this (when the set of input strings is large) is a
no-no in terms of memory (in my own experience).

I've added a method that creates an optimized automaton from a union
of Strings in one step, but I see this hasn't been ported to Lucene
yet.

http://www.brics.dk/automaton/doc/dk/brics/automaton/BasicAutomata.html#makeStringUnion(java.lang.CharSequence...)
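
To illustrate what building the automaton "in one step" can look like,
here is a minimal sketch, assuming the input strings arrive in sorted
order. It uses the well-known incremental construction of a minimal
acyclic automaton, where states that can no longer change are shared
through a register as soon as the input moves past them. This is not
the Brics or Lucene code: the class SortedStringUnion and its methods
build, accepts and countStates are hypothetical names made up for this
example.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch: minimal acyclic automaton built directly from sorted strings. */
final class SortedStringUnion {

    /** A state; equals/hashCode are structural so equivalent states can be shared. */
    static final class State {
        final TreeMap<Character, State> next = new TreeMap<>();
        boolean accept;

        @Override public boolean equals(Object o) {
            if (!(o instanceof State)) return false;
            State other = (State) o;
            if (accept != other.accept || next.size() != other.next.size()) return false;
            Iterator<Map.Entry<Character, State>> a = next.entrySet().iterator();
            Iterator<Map.Entry<Character, State>> b = other.next.entrySet().iterator();
            while (a.hasNext()) {
                Map.Entry<Character, State> x = a.next();
                Map.Entry<Character, State> y = b.next();
                // Targets are already canonical here, so identity comparison is enough.
                if (!x.getKey().equals(y.getKey()) || x.getValue() != y.getValue()) return false;
            }
            return true;
        }

        @Override public int hashCode() {
            int h = accept ? 1 : 0;
            for (Map.Entry<Character, State> e : next.entrySet()) {
                h = 31 * h + e.getKey().charValue();
                h = 31 * h + System.identityHashCode(e.getValue());
            }
            return h;
        }
    }

    private final State root = new State();
    private final Map<State, State> register = new HashMap<>(); // canonical, frozen states
    private String previous;

    private void add(String word) {
        if (previous != null && previous.compareTo(word) > 0)
            throw new IllegalArgumentException("input must be sorted: " + word);
        previous = word;
        // Follow the prefix shared with the previously added word.
        State s = root;
        int i = 0;
        while (i < word.length()) {
            State n = s.next.get(word.charAt(i));
            if (n == null) break;
            s = n;
            i++;
        }
        // The rest of the previous word's path can no longer change: share it.
        if (!s.next.isEmpty()) replaceOrRegister(s);
        // Append fresh states for the remaining suffix.
        for (; i < word.length(); i++) {
            State n = new State();
            s.next.put(word.charAt(i), n);
            s = n;
        }
        s.accept = true;
    }

    /** Replaces the last child of s with an equivalent registered state, or registers it. */
    private void replaceOrRegister(State s) {
        Map.Entry<Character, State> last = s.next.lastEntry();
        State child = last.getValue();
        if (!child.next.isEmpty()) replaceOrRegister(child);
        State canonical = register.putIfAbsent(child, child);
        if (canonical != null) s.next.put(last.getKey(), canonical);
    }

    static State build(String... sortedWords) {
        SortedStringUnion b = new SortedStringUnion();
        for (String w : sortedWords) b.add(w);
        if (!b.root.next.isEmpty()) b.replaceOrRegister(b.root);
        return b.root;
    }

    static boolean accepts(State root, String word) {
        State s = root;
        for (int i = 0; i < word.length() && s != null; i++) s = s.next.get(word.charAt(i));
        return s != null && s.accept;
    }

    /** Counts distinct reachable states (by identity). */
    static int countStates(State root) {
        Set<State> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<State> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            State s = todo.pop();
            if (seen.add(s)) todo.addAll(s.next.values());
        }
        return seen.size();
    }
}
```

For example, SortedStringUnion.build("car", "cars", "cat", "cats")
yields an automaton in which the states reached by "car" and "cat" are
shared, so no separate minimization pass over a large intermediate
automaton is needed. The sketch works on Java chars and relies on
String.compareTo order; a real port would have to follow whatever
label type and ordering the target library uses.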

If you could provide a patch that would port that code to Lucene it'd
be great (I guess it's trivial) and would speed up your step (1)
greatly.

Dawid


