lucene-solr-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-1670) synonymfilter/map repeat bug
Date Sun, 20 Dec 2009 10:24:18 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792980#action_12792980 ]

Robert Muir commented on SOLR-1670:
-----------------------------------

bq. Robert, can you tell if the flaw is in the test code or the SynonymFilter? I'm pretty
sure that such a bug (in the filter) wasn't originally there... so my money would be on the
test - perhaps getTokList, but I haven't had a chance to look yet.

I am pretty sure there is a problem. The rewritten test in SOLR-1674 looks like:

{code}
assertTokenizesTo(map, "a b", new String[] { "ab", "ab", "ab"  });
{code}

where assertTokenizesTo is just

{code}
static void assertTokenizesTo(SynonymMap dict, String input,
    String expected[]) throws IOException {
  Tokenizer tokenizer = new WhitespaceTokenizer(new StringReader(input));
  SynonymFilter stream = new SynonymFilter(tokenizer, dict);
  assertTokenStreamContents(stream, expected);
}
{code}

and assertTokenStreamContents is the one copied from Lucene.
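
For reference, a strict stream check has to work roughly like the sketch below. This is only a minimal approximation of what such a helper needs to do, not the actual Lucene code, and it assumes the TermAttribute API of that era; the important part is the final assertFalse, which is what surfaces the extra "ab" tokens.

{code}
// Minimal sketch (not the actual Lucene helper): consume the whole stream,
// compare each term, and then assert the stream is exhausted. The final
// assertFalse is what catches extra tokens such as a repeated "ab".
static void assertTokenStreamContents(TokenStream stream, String[] expected)
    throws IOException {
  TermAttribute termAtt = stream.addAttribute(TermAttribute.class);
  stream.reset();
  for (int i = 0; i < expected.length; i++) {
    assertTrue("premature end of stream at token " + i, stream.incrementToken());
    assertEquals(expected[i], termAtt.term());
  }
  assertFalse("stream produced more tokens than expected", stream.incrementToken());
  stream.end();
  stream.close();
}
{code}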


> synonymfilter/map repeat bug
> ----------------------------
>
>                 Key: SOLR-1670
>                 URL: https://issues.apache.org/jira/browse/SOLR-1670
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>         Attachments: SOLR-1670_test.patch
>
>
> as part of converting tests for SOLR-1657, I ran into a problem with synonymfilter
> the test for 'repeats' has a flaw: it uses this assertTokEqual construct, which does not really validate that two lists of tokens are equal; it just stops at the shorter one.
> {code}
>     // repeats
>     map.add(strings("a b"), tokens("ab"), orig, merge);
>     map.add(strings("a b"), tokens("ab"), orig, merge);
>     assertTokEqual(getTokList(map,"a b",false), tokens("ab"));
>     /* in reality the result from getTokList is ab ab ab!!!!! */
> {code}
> when converted to assertTokenStreamContents, this problem surfaced. attached is an additional assertion to the existing testcase.
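
To make the flaw concrete, the assertTokEqual problem described in the report above boils down to something like the following hypothetical comparison (an illustration only, not the real Solr test code): because it iterates only up to the shorter list, tokens("ab") silently passes against the actual output ab ab ab.

{code}
// Hypothetical illustration of the flaw, not the real assertTokEqual:
// comparing only up to the shorter list means ["ab"] passes against
// ["ab", "ab", "ab"] because the extra tokens are never inspected.
static void assertTokEqualTruncating(List<String> expected, List<String> actual) {
  int n = Math.min(expected.size(), actual.size());
  for (int i = 0; i < n; i++) {
    assertEquals(expected.get(i), actual.get(i));
  }
  // missing: assertEquals(expected.size(), actual.size());
}
{code}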

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

