lucene-dev mailing list archives

From eks dev <eks...@googlemail.com>
Subject Re: new AutomatonQuery(RunAutomaton) ?
Date Wed, 31 Aug 2011 17:30:48 GMT
I do not think it will be expensive; it is just an attempt to keep the
code smaller, simpler and marginally faster :)

There are a lot (ca. 1000) of small prefix-based regexes over a limited
alphabet, compiled as RunAutomaton, which I load on startup and look up
from a RunAutomaton[] on request...

they look like Regex("((123)|(124)|(401)|(777)|(351))[0-9]{0,2}")

By the way, what will AutomatonQuery prefer "(XXX)[0-9]{0,2}" or
"(XXX)[0-9]*" or "(XXX).*" ? Any performance difference?

Semantically they are the same for me, as I know that my content is only 5 digits.
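
To make the "semantically the same" claim concrete, here is a minimal sketch that checks it exhaustively for 5-digit terms. It uses java.util.regex as a stand-in, not Lucene's RegExp/AutomatonQuery, so it says nothing about performance; `matches()` is used because it tests the whole string, as an automaton built from a regex would. The class and method names are hypothetical, and the prefix list is the one from the example regex above.

```java
import java.util.regex.Pattern;

// Stand-in check with java.util.regex (NOT Lucene's RegExp): do the three
// suffix variants accept the same terms when every term is 5 digits?
class PatternEquivalence {
    // Prefix alternation copied from the example regex in this message.
    static final String PREFIX = "((123)|(124)|(401)|(777)|(351))";

    static final Pattern BOUNDED = Pattern.compile(PREFIX + "[0-9]{0,2}");
    static final Pattern STAR = Pattern.compile(PREFIX + "[0-9]*");
    static final Pattern ANY = Pattern.compile(PREFIX + ".*");

    /** True if all three variants give the same full-match answer for the term. */
    static boolean agree(String term) {
        boolean a = BOUNDED.matcher(term).matches();
        boolean b = STAR.matcher(term).matches();
        boolean c = ANY.matcher(term).matches();
        return a == b && b == c;
    }

    /** Counts the terms "00000".."99999" on which the variants disagree. */
    static int disagreementsOnFiveDigitTerms() {
        int count = 0;
        for (int i = 0; i < 100_000; i++) {
            String term = String.format("%05d", i);
            if (!agree(term)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("disagreements on 5-digit terms: "
                + disagreementsOnFiveDigitTerms());
        // Terms outside the assumed content shape separate the variants.
        System.out.println("agree on 123456: " + agree("123456"));
        System.out.println("agree on 123ab: " + agree("123ab"));
    }
}
```

The variants only diverge once a term is longer than 5 characters or contains a non-digit, so the equivalence depends entirely on the assumption about the indexed content.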

I need them to
1. formulate a complex BooleanQuery, where the AutomatonQuery is one clause
2. do post-processing of the hits (there are a lot of them), running the
"query against hits"; this has to be fast.

I guess I will switch to keeping only Automaton[] and build the
RunAutomaton on the fly (per request) for the fast query-vs-hits check. This
is done only once per request, but then I need to keep the state of the
RunAutomaton per query... which makes things slightly more verbose...
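
A rough sketch of that "keep the source form, compile once per request" idea, written against plain JDK types so it runs without Lucene. Everything here is hypothetical: `RequestMatchers` is not a Lucene class, the `List<S>` plays the role of the long-lived Automaton[], and the `Predicate<String>` produced by the compile function plays the role of the per-request RunAutomaton.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.regex.Pattern;

/** Hypothetical per-request holder: compiles each source at most once. */
class RequestMatchers<S> {
    private final List<S> sources; // long-lived, loaded at startup
    private final Function<S, Predicate<String>> compile; // source -> run form
    private final Map<Integer, Predicate<String>> compiled = new HashMap<>();
    int compileCount = 0; // exposed only to show that compilation happens once

    RequestMatchers(List<S> sources, Function<S, Predicate<String>> compile) {
        this.sources = sources;
        this.compile = compile;
    }

    /** Returns the run form for the given source id, compiling it on first use. */
    Predicate<String> matcherFor(int id) {
        return compiled.computeIfAbsent(id, k -> {
            compileCount++;
            return compile.apply(sources.get(k));
        });
    }

    /** Post-processing step: keeps only the hits accepted by the matcher. */
    List<String> filterHits(int id, List<String> hits) {
        Predicate<String> matcher = matcherFor(id);
        List<String> kept = new ArrayList<>();
        for (String hit : hits) {
            if (matcher.test(hit)) {
                kept.add(hit);
            }
        }
        return kept;
    }
}

class RequestMatchersDemo {
    /** One instance per request; java.util.regex stands in for the automaton. */
    static RequestMatchers<String> newRequest(List<String> regexes) {
        return new RequestMatchers<>(regexes, regex -> {
            Pattern p = Pattern.compile(regex);
            return term -> p.matcher(term).matches();
        });
    }

    public static void main(String[] args) {
        List<String> regexes = new ArrayList<>();
        regexes.add("((123)|(124))[0-9]{0,2}");
        RequestMatchers<String> request = newRequest(regexes);

        List<String> hits = new ArrayList<>();
        hits.add("12345");
        hits.add("77700");
        System.out.println(request.filterHits(0, hits));
    }
}
```

The extra verbosity is confined to one small object created per request, which can be dropped when the request ends.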


On Wed, Aug 31, 2011 at 5:06 PM, Robert Muir <rcmuir@gmail.com> wrote:
> Can you provide more information about your automaton and why
> 'recompiling' it might be expensive?
>
> E.g. #states/#transitions, is it finite or infinite, etc.
>
> On Wed, Aug 31, 2011 at 10:56 AM, eks dev <eksdev@yahoo.co.uk> wrote:
>> Thanks Robert, this is what I expected after looking into CompiledAutomaton ..
>>
>> On Wed, Aug 31, 2011 at 2:00 PM, Robert Muir <rcmuir@gmail.com> wrote:
>>> On Wed, Aug 31, 2011 at 3:51 AM, eks dev <eksdev@yahoo.co.uk> wrote:
>>>> At the moment it is not possible (?) to construct AutomatonQuery with
>>>> RunAutomaton.
>>>> Would it make sense to add this possibility? Is it doable at all?
>>>
>>> It's not doable: the RunAutomaton alone is not enough, we need more information than it carries.
>>>
>>> --
>>> lucidimagination.com
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
> --
> lucidimagination.com