From: "Hoss Man (JIRA)"
To: dev@lucene.apache.org
Date: Wed, 29 Sep 2010 19:46:37 -0400 (EDT)
Message-ID: <25062128.469781285803997217.JavaMail.jira@thor>
In-Reply-To: <23305935.30441282912673883.JavaMail.jira@thor>
Subject: [jira] Commented: (SOLR-2093) regular expression in PatternReplaceFilter can handle: /([^/]*)

    [ https://issues.apache.org/jira/browse/SOLR-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916309#action_12916309 ]

Hoss Man commented on SOLR-2093:
--------------------------------

Note: Part of your confusion may lie in the meaning of {{replace="all"}}. It does not mean "replace the entire Token"; it means "replace all matches of the regex with the replacement value". The pattern is evaluated over and over against the input string (each attempt starting at the end of the last match) until it no longer matches, and each match results in a replacement.

If you want the entire input Token to be replaced by the parenthetical group, you need to anchor your regex at both ends.

This should work..

{noformat}
{noformat}

> regular expression in PatternReplaceFilter can handle: /([^/]*)
> ---------------------------------------------------------------
>
>                 Key: SOLR-2093
>                 URL: https://issues.apache.org/jira/browse/SOLR-2093
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>         Environment: debian,JRE1.6,solr1.4
>            Reporter: Kuri Masta
>            Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Using PatternReplaceFilter I want to extract a certain word out of the URI.
> Although I now understand that I should handle this outside of Solr, the fact remains that Solr does not adequately handle regular expressions.
> Viewing the source code, I don't see any problems since it uses the Java library.
> The problem:
>
>
>   pattern="/([^/]*)/[^/]*$" replacement="$1" replace="all" />
>
> Input text:
> - a/b/c
> Expected
> - b
> Result Solr
> - ab
> An online Java regexp tester (http://www.regexplanet.com/simple/index.html):
> - b
> So the problem area lies at /([^/])
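
For anyone who wants to check the replace="all" behaviour described in the comment outside of Solr, here is a minimal sketch using plain java.util.regex. It assumes the filter ends up calling Matcher.replaceAll / Matcher.replaceFirst on the token text, which is not confirmed in this thread. The class name, the helper methods, and in particular the anchored pattern "^.*/([^/]*)/[^/]*$" are illustrative assumptions; the anchored pattern is one possible way to "anchor at both ends" and is not necessarily the configuration suggested above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Solr.
class PatternReplaceDemo {

    // Replace every match of the pattern (the assumed meaning of replace="all").
    static String replaceAll(String pattern, String replacement, String input) {
        return Pattern.compile(pattern).matcher(input).replaceAll(replacement);
    }

    // Replace only the first match (the assumed meaning of replace="first").
    static String replaceFirst(String pattern, String replacement, String input) {
        return Pattern.compile(pattern).matcher(input).replaceFirst(replacement);
    }

    public static void main(String[] args) {
        // "all" means every match is replaced, not the whole token.
        System.out.println(replaceAll("/", "-", "x/y/z"));   // prints x-y-z
        System.out.println(replaceFirst("/", "-", "x/y/z")); // prints x-y/z

        // Pattern from the issue: it only matches "/b/c", so the leading
        // "a" is left untouched and the token becomes "ab".
        System.out.println(replaceAll("/([^/]*)/[^/]*$", "$1", "a/b/c")); // prints ab

        // Assumed anchored variant: the match now covers the entire token,
        // so the whole token is replaced by group 1.
        System.out.println(replaceAll("^.*/([^/]*)/[^/]*$", "$1", "a/b/c")); // prints b
    }
}
```

The unanchored pattern reproduces the "ab" result reported in the issue, which suggests the regex engine is behaving as documented: only the matched portion of the token is replaced.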