lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Robert Muir (JIRA)" <>
Subject [jira] Updated: (LUCENE-2970) SpecialOperations.isFinite can have TERRIBLE TERRIBLE runtime in certain situations
Date Wed, 16 Mar 2011 13:07:30 GMT


Robert Muir updated LUCENE-2970:

    Attachment: LUCENE-2970.patch

Attached is a patch: imagine a regexp with lots of "optionals" e.g. [abcd]e?f?[gh]a?b? ...

In this case the code is not linear in number of states... if we are at state A, and it has
a transition to B, we determine that B is finite, then later if we are at C and it leads to
B too, we need not determine if B is finite again, as we already did so. So, I keep "visited"
for this.

Additionally I changed it to use a Bitset instead of a HashSet, which helps the speed (but
just a constant-time speedup).

I took the old code, dumped it into AutomatonTestUtil as "isFiniteSimple" and the test just
generates random automata and compares this versus the new implementation.

I'd appreciate any reviews/suggestions any automaton-hackers want to give here.

> SpecialOperations.isFinite can have TERRIBLE TERRIBLE runtime in certain situations
> -----------------------------------------------------------------------------------
>                 Key: LUCENE-2970
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 4.0
>         Attachments: LUCENE-2970.patch
> in an application of mine, i experienced some very slow query times with finite automata
(all the DFAs are acyclic)
> It turned out, the slowdown is some terrible runtime in SpecialOperations.isFinite <--
this is used to determine if the DFA is acyclic or not.
> (in this case I am talking about even up to minutes of cpu).

This message is automatically generated by JIRA.
For more information on JIRA, see:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message