lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5152) Lucene FST is not immutable
Date Thu, 08 Aug 2013 12:42:48 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733454#comment-13733454
] 

Michael McCandless commented on LUCENE-5152:
--------------------------------------------

bq. can you elaborate what you are concerned about?

I'm worried about the O(N^2) cost of the assert: for every arc (single
byte of each term in a seekExact) we are iterating over all root arcs
(up to 256 arcs) in this assert.

bq. findTargetArc is the only place where we actually use this cache?

Ahh that's true, I hadn't realized that.

Maybe, instead, we can move the assert just inside the if that
actually uses the cached arcs?  Ie, put it here:

{code}
  if (follow.target == startNode && labelToMatch < cachedRootArcs.length) {
    assert assertRootArcs();
    ...
  }
{code}

This would address my concern: the cost becomes O(N) not O(N^2).  And
the coverage is the same?

                
> Lucene FST is not immutable
> ---------------------------
>
>                 Key: LUCENE-5152
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5152
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/FSTs
>    Affects Versions: 4.4
>            Reporter: Simon Willnauer
>            Priority: Blocker
>             Fix For: 5.0, 4.5
>
>         Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch
>
>
> a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned output
from and FST (BytesRef) which caused sideffects in later execution. 
> I added an assertion into the FST that checks if a cached root arc is modified and in-fact
this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need
to think about how to make this less trappy since it can cause bugs that are super hard to
find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message