lucene-issues mailing list archives

From "Erick Erickson (Jira)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8596) The replacement of comments is a bug, in "UserDictionary.java"
Date Tue, 03 Dec 2019 19:03:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987195#comment-16987195 ]

Erick Erickson commented on LUCENE-8596:
----------------------------------------

Why do you think it's a bug? What behavior do you see that is related to this?

Here are the next few lines:
 
{code:java}
     line = line.replaceAll("#.*$", "");
     // Skip empty lines or comment lines
     if (line.trim().length() == 0) {
       continue;
     }
{code}

The current code is going to replace everything from the hash through the end of line with
an empty string. If this is a line like:
{code}
# some comment
{code}
the if clause does the same thing whether the comment is replaced with a space or an empty
string, because line.trim().length() will be zero either way.
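
To make that concrete, here is a minimal sketch of the comment-only case. The class name and the helpers strip() and isSkipped() are hypothetical, written only for this illustration; they are not part of UserDictionary.java. The single-space replacement is an assumed stand-in for the proposed change.

```java
class CommentOnlyLineDemo {
    // Hypothetical helper: replaces everything from '#' through end of line
    // with the given replacement, using the same regex as the snippet above.
    static String strip(String line, String replacement) {
        return line.replaceAll("#.*$", replacement);
    }

    // Mirrors the skip check from the snippet above.
    static boolean isSkipped(String stripped) {
        return stripped.trim().length() == 0;
    }

    public static void main(String[] args) {
        String line = "# some comment";
        // Empty-string replacement (current code): the line is skipped.
        System.out.println(isSkipped(strip(line, "")));   // prints true
        // Single-space replacement (assumed alternative): still skipped.
        System.out.println(isSkipped(strip(line, " ")));  // prints true
    }
}
```

Either way trim() reduces the result to an empty string, so the continue branch is taken.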

If it's a line like this:
{code}
   something, somethingelse#more stuff
{code}
With the change there'd be a space after "somethingelse". Do you have evidence that this
causes a problem?
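
As a rough sketch of the trailing-comment case, the only visible difference is one trailing space. The class and its strip() helper are hypothetical and used only for illustration, and the single-space replacement is an assumption about what the change would do.

```java
class InlineCommentDemo {
    // Hypothetical helper: same regex as the current code, with the
    // replacement string passed in so both variants can be compared.
    static String strip(String line, String replacement) {
        return line.replaceAll("#.*$", replacement);
    }

    public static void main(String[] args) {
        String line = "   something, somethingelse#more stuff";
        // Brackets make leading and trailing whitespace visible.
        System.out.println("[" + strip(line, "") + "]");   // prints [   something, somethingelse]
        System.out.println("[" + strip(line, " ") + "]");  // prints [   something, somethingelse ]
    }
}
```

Whether that extra space matters depends on how the rest of the parser treats trailing whitespace, which this sketch does not model.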

If so, a test case demonstrating the incorrect behavior would be very valuable, because
nothing jumps out from just looking at the code.



> The replacement of comments is a bug, in "UserDictionary.java"
> --------------------------------------------------------------
>
>                 Key: LUCENE-8596
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8596
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: miyaharas
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/dict/UserDictionary.java#L68]
>  
> Hi,
> I think this is a bug.
> I think the following is correct:
> {code:java}
> line = line.replaceAll ("^ #. * $", "");  
> {code}
>  
>  



