lucene-dev mailing list archives

From "Dawid Weiss (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3820) Wrong trailing index calculation in PatternReplaceCharFilter
Date Thu, 23 Feb 2012 07:21:49 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214412#comment-13214412 ]

Dawid Weiss commented on LUCENE-3820:
-------------------------------------

Thanks for looking at this, Robert. I'll get back to this later today, but I can tell you right
now that, having worked it through on paper, negative indexes make logical sense for "prepended"
characters. So:

PATTERN: A
INPUT: ABCDEF
REPLACEMENT: XYZ
OUTPUT: XYZBCDEF

then (in my patch) X and Y would have negative offsets. It's a matter of convention, I guess.
Negative indexes are consistent with something like this:

PATTERN: ^
INPUT: ABC
REPLACEMENT: XYZ
OUTPUT: XYZABC

then all three characters (XYZ) have a negative index to indicate they're prepended. Thoughts?
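
The two examples above can be reproduced with a small sketch. It is not the code from
LUCENE-3820.patch; it is one hypothetical reading in which the replacement is right-aligned
against the end of the matched region, so any replacement characters that spill over the start
of the input get negative offsets. The class and method names are made up for illustration,
only the first match is handled, and the replacement is treated as a literal string.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; NOT the logic of LUCENE-3820.patch.
class PrependedOffsetsSketch {

    // Replaces the first match of `regex` in `input` with the literal
    // `replacement` and returns, for each output character, the input
    // offset it is attributed to. Replacement characters are right-aligned
    // against the end of the match (an assumption), so characters that do
    // not fit before position 0 receive negative offsets.
    static int[] offsetsAfterFirstReplacement(String regex, String input, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            // No match: every output character maps to itself.
            int[] identity = new int[input.length()];
            for (int i = 0; i < identity.length; i++) {
                identity[i] = i;
            }
            return identity;
        }
        int start = m.start();
        int end = m.end();
        int r = replacement.length();
        int[] offsets = new int[start + r + (input.length() - end)];
        int out = 0;
        // Characters before the match keep their offsets.
        for (int i = 0; i < start; i++) {
            offsets[out++] = i;
        }
        // Replacement characters, right-aligned against the match end.
        for (int j = 0; j < r; j++) {
            offsets[out++] = end - r + j;
        }
        // Characters after the match keep their offsets.
        for (int i = end; i < input.length(); i++) {
            offsets[out++] = i;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // PATTERN: A, INPUT: ABCDEF, REPLACEMENT: XYZ
        System.out.println(Arrays.toString(
            offsetsAfterFirstReplacement("A", "ABCDEF", "XYZ")));
        // PATTERN: ^, INPUT: ABC, REPLACEMENT: XYZ
        System.out.println(Arrays.toString(
            offsetsAfterFirstReplacement("^", "ABC", "XYZ")));
    }
}
```

Under this reading the first call yields [-2, -1, 0, 1, 2, 3, 4, 5] (X and Y negative, Z on the
original position of A) and the second yields [-3, -2, -1, 0, 1, 2] (all of XYZ negative). For a
match in the middle of the input the same rule would reuse offsets of earlier characters rather
than go negative, so it should not be taken as a general mapping.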

                
> Wrong trailing index calculation in PatternReplaceCharFilter
> ------------------------------------------------------------
>
>                 Key: LUCENE-3820
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3820
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: LUCENE-3820.patch, LUCENE-3820_test.patch, LUCENE-3820_test.patch
>
>
> I need to use PatternReplaceCharFilter's index corrections directly and it fails for
> me -- the trailing index is not mapped correctly for a pattern "\\.[\\s]*" and replacement
> ".", input "A. .B.".
> I tried to understand the logic in getReplaceBlock but eventually gave up and simply
> rewrote it from scratch. After my changes a few tests don't pass, but I don't know if it's
> the tests that are wrong or my logic. In essence, the difference between the previous
> implementation and mine is how indexes are mapped for shorter replacements. I
> shift indexes of shorter regions to the "right" of the original index pool, whereas the previous
> patch seems to squeeze them to the left (I don't know why, though).
> If anybody remembers how it's supposed to work, please feel free to correct me.
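
For reference, the input from the description can be examined with plain java.util.regex,
without any Lucene classes. The sketch below only shows what the pattern matches and what the
replaced text looks like; it does not reproduce PatternReplaceCharFilter or its offset
correction. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration with java.util.regex only; no Lucene code involved.
class TrailingIndexInputDemo {
    // The pattern from the issue description.
    static final Pattern PATTERN = Pattern.compile("\\.[\\s]*");

    // Applies the replacement "." from the issue description.
    static String replaced(String input) {
        return PATTERN.matcher(input).replaceAll(".");
    }

    // Lists each match as "start-end" (end exclusive) in input coordinates.
    static String matchSpans(String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(m.start()).append('-').append(m.end());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "A. .B.";
        System.out.println(replaced(input));   // A..B.
        System.out.println(matchSpans(input)); // 1-3 3-4 5-6
    }
}
```

The input has 6 characters and the output 5: only the first match (". " at 1-3) shrinks, the
other two are replaced by text of the same length. One would therefore expect the end of the
output (offset 5) to map back to the end of the input (offset 6); the description does not say
which value the filter actually returned.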

