lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-2093) regular expression in PatternReplaceFilter can handle: /([^/]*)
Date Wed, 29 Sep 2010 23:39:32 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916307#action_12916307 ]

Hoss Man commented on SOLR-2093:
--------------------------------

bq. I would like you to try again. But please keep in mind that I don't need this fix, I just
found a bug and am reporting it.

I see no bug here.

As Koji described, even the online regex tool you provided produces these exact results.

Input...
{noformat}
Regular Expression: /([^/]*)/[^/]*$
Replacement: $1
Test String #1: a/b/c
{noformat}

Output...
{noformat}
...
replaceAll(): ab
...
group(0): /b/c
group(1): b
{noformat}

bq. I wouldn't even know how to write regexp so it will concatenate two separate matches

I don't think you understand the regex you provided.  I don't believe there are two matches.
I believe there is one match (referred to in your online tool as "group(0)"), and that entire
match is replaced by the first parenthetical group (referred to in your online tool as "group(1)").
The leading "a" is not part of the match, so it is left in place; "/b/c" is replaced with "b",
which gives "ab".
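
For illustration, here is a minimal sketch using plain java.util.regex, on the assumption that the filter with replace="all" behaves like Matcher.replaceAll(). The class and method names are made up for this example and are not Solr code. The second pattern is only one possible way to get "b", assuming the intent is to keep nothing but the captured segment: it makes the match cover the whole token.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Solr.
class PatternReplaceDemo {
    // Pattern and replacement copied from the schema snippet in the issue.
    static final Pattern REPORTED = Pattern.compile("/([^/]*)/[^/]*$");

    // Assumption: a variant that also consumes everything before the
    // captured segment, so the match spans the entire token.
    static final Pattern WHOLE_TOKEN = Pattern.compile("^.*/([^/]*)/[^/]*$");

    // Replaces every match with group 1; text outside the match is kept.
    static String replaceAll(String input) {
        return REPORTED.matcher(input).replaceAll("$1");
    }

    // Same replacement, but the match covers the whole input.
    static String replaceWholeToken(String input) {
        return WHOLE_TOKEN.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        String input = "a/b/c";
        Matcher m = REPORTED.matcher(input);
        if (m.find()) {
            System.out.println("group(0): " + m.group(0)); // /b/c
            System.out.println("group(1): " + m.group(1)); // b
        }
        System.out.println("replaceAll(): " + replaceAll(input));         // ab
        System.out.println("whole token:  " + replaceWholeToken(input));  // b
    }
}
```

The only difference between the two patterns is the leading "^.*", which pulls the "a" into group(0) so that it gets replaced along with the rest.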




> regular expression in PatternReplaceFilter can handle: /([^/]*)
> ---------------------------------------------------------------
>
>                 Key: SOLR-2093
>                 URL: https://issues.apache.org/jira/browse/SOLR-2093
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>         Environment: debian,JRE1.6,solr1.4
>            Reporter: Kuri Masta
>            Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Using PatternReplaceFilter I want to extract a certain word out of the URI.
> Although I now understand that I should handle this outside of Solr, the fact remains
> that Solr does not adequately handle regular expressions.
> Viewing the source code, I don't see any problems since it uses the java library.
> The problem:
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.PatternReplaceFilterFactory"
>                         pattern="/([^/]*)/[^/]*$" replacement="$1"  replace="all" />
>       </analyzer>
> Input text:
> - a/b/c
> Expected
> - b
> Result Solr
> - ab
> An online Java regexp tester (http://www.regexplanet.com/simple/index.html):
> - b
> So the problem area lies at /([^/])

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

