lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Olivier Binda <olivier.bi...@wanadoo.fr>
Subject Re: Finding a match for an automaton against a FST
Date Sat, 10 Jan 2015 13:23:04 GMT
On 01/10/2015 11:00 AM, Michael McCandless wrote:
> On Fri, Jan 9, 2015 at 6:42 AM, Olivier Binda <olivier.binda@wanadoo.fr> wrote:
>> Hello.
>>
>> 1) What is the best way to check if an automaton (from a regex or a string
>> with a wildcard)
>> has at least 1 match against a FST (from a WFSTCompletionLookup) ?
> You need to implement "intersect".  We already have this method for
> two automata (Operations.java); maybe you can start from that but
> cutover to the FST APIs instead for the 2nd automaton?

I looked a bit into this. This is complicated stuff :/

I think I get what the nested loops in intersect() do  : transitions 
consist of a double dimension array, somehow those arays are intersected.
I don't understand yet why is there a .min and a .max for a transition 
(why not just a codepoint ?)

Fst and automaton (and maybee the lucene codec stuff) are 3 different 
implementations
of finite state machine/transducers, right ?

How does regexQuery (automaton) match against an index ?
Does it use intersect() internally ?  (if it does, maybee I could reuse 
that code too)



>
>> 2) Also, is there a simple/efficient way to find the lowest and the highest
>> arcs of a FST that match against  an automaton ?
> Hmm arcs leaving which state?  The initial state?  You could simply
> walk all arcs leaving the initial state from the FST and check if the
> automaton accepts them leaving its initial state (assuming the
> automaton has no dead states)?
>
> Or, if you are already doing an intersection here, just save this
> information as a side effect since you will have already computed it.

thanks for the tips, it helps.
Olivier



> Mike McCandless
>
> http://blog.mikemccandless.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message