lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5152) Lucene FST is not immutable
Date Sat, 03 Aug 2013 00:31:48 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728317#comment-13728317 ]

Michael McCandless commented on LUCENE-5152:
--------------------------------------------

bq. how is getBytesReader related to the root arcs?

Well, anything that works with the FST APIs needs to call
getBytesReader first, e.g. MemoryPF does this every time you pull a
TermsEnum from it.

bq. This is a very trappy thing and we should catch any violation IMO very quickly.

I agree it's trappy and it's great to add this check.

I'm simply proposing moving it somewhere that is less of a hot spot.
I don't think this will affect how quickly we catch violations, but it
should reduce the cost of the added assertion.

In fact, I think findTargetArc isn't a great place for the check; e.g.
I think MemoryPF (MemoryPostingsFormat) only uses this API if the
caller calls seekExact?  So I think the current location of the assert
is both more costly and gives lower coverage than if we moved it to
FST.getBytesReader.
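
To make the trade-off concrete, here is a rough sketch of the idea, not
the actual patch. RootArcCheckSketch, Arc, rootArcsUnmodified and
checksRun are hypothetical names invented for illustration; they are not
Lucene's FST internals. The sketch assumes the check compares the cached
root arcs against a snapshot taken at construction, and it throws an
explicit exception in place of a Java assert so that it behaves the same
whether or not -ea is set.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an FST with cached root arcs; not Lucene's FST class. */
final class RootArcCheckSketch {

    /** Simplified arc: a label plus a mutable output buffer. */
    static final class Arc {
        final int label;
        final byte[] output;

        Arc(int label, byte[] output) {
            this.label = label;
            this.output = output;
        }
    }

    private final Arc[] cachedRootArcs;
    private final byte[][] pristineOutputs; // snapshot taken at construction
    int checksRun = 0;                      // counts how often the check executes

    RootArcCheckSketch(Arc[] rootArcs) {
        this.cachedRootArcs = rootArcs;
        this.pristineOutputs = new byte[rootArcs.length][];
        for (int i = 0; i < rootArcs.length; i++) {
            pristineOutputs[i] = rootArcs[i].output.clone();
        }
    }

    /** The (relatively expensive) check: compares every cached root arc to its snapshot. */
    boolean rootArcsUnmodified() {
        checksRun++;
        for (int i = 0; i < cachedRootArcs.length; i++) {
            if (!Arrays.equals(cachedRootArcs[i].output, pristineOutputs[i])) {
                return false;
            }
        }
        return true;
    }

    /** Entry point that every user of the structure calls before traversing it. */
    Object getBytesReader() {
        if (!rootArcsUnmodified()) {
            throw new IllegalStateException("cached root arc was modified");
        }
        return new Object(); // placeholder for a real reader
    }

    /** Hot per-label lookup: deliberately carries no check in this sketch. */
    Arc findTargetArc(int label) {
        for (Arc arc : cachedRootArcs) {
            if (arc.label == label) {
                return arc;
            }
        }
        return null;
    }
}
```

With the check placed this way it runs once per getBytesReader call, no
matter how many findTargetArc lookups follow, and it also covers callers
that never call findTargetArc at all.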

                
> Lucene FST is not immutable
> ---------------------------
>
>                 Key: LUCENE-5152
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5152
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/FSTs
>    Affects Versions: 4.4
>            Reporter: Simon Willnauer
>            Priority: Blocker
>             Fix For: 5.0, 4.5
>
>         Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch
>
>
> A spinoff from LUCENE-5120, where the analyzing suggester modified a returned output
> from an FST (BytesRef), which caused side effects in later execution.
> I added an assertion into the FST that checks if a cached root arc is modified, and in fact
> this happens, for instance, in our MemoryPostingsFormat, and I bet we will find more places. We need
> to think about how to make this less trappy, since it can cause bugs that are super hard to
> find.
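
The trap described in the issue can be illustrated with a minimal sketch.
SharedOutputSketch, Output, lookupShared and lookupCopy are hypothetical
names made up for this example; Output is only loosely modeled on a
mutable bytes holder and is not Lucene's BytesRef. The sketch assumes a
structure that hands out a reference to an internally cached value.

```java
import java.util.Arrays;

/** Hypothetical illustration of a shared, mutable return value; not Lucene code. */
final class SharedOutputSketch {

    /** Hypothetical mutable output holder. */
    static final class Output {
        final byte[] bytes;

        Output(byte[] bytes) {
            this.bytes = bytes;
        }

        /** Deep copy, so the caller owns the returned bytes. */
        Output copy() {
            return new Output(Arrays.copyOf(bytes, bytes.length));
        }
    }

    private final Output cached = new Output(new byte[] {1, 2, 3});

    /** Returns the shared instance: callers must treat it as read-only. */
    Output lookupShared() {
        return cached;
    }

    /** Returns a private copy that the caller may modify freely. */
    Output lookupCopy() {
        return cached.copy();
    }
}
```

Writing to the result of lookupShared changes what every later caller
sees, which is the kind of side effect that is hard to trace back to its
source. Writing to the result of lookupCopy leaves the cached value
untouched, at the price of an allocation per lookup.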


