lucene-lucene-net-dev mailing list archives

From "Prescott Nasser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENENET-486) Wildcard queries are not analyzed
Date Fri, 20 Apr 2012 21:31:32 GMT

    [ https://issues.apache.org/jira/browse/LUCENENET-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258600#comment-13258600 ]

Prescott Nasser commented on LUCENENET-486:
-------------------------------------------

We could add a developer-facing flag in the future for anyone who wants to flip the behavior.
Both the behavior Bjorn is asking for and the current implementation have merit.

Fix from Bjorn:

--- C:\Users\xyt\AppData\Local\Temp\AnkhSVN\3179\QueryParser.27919.cs 20.04.2012  08:55:36
+++ C:\_VS.NET\se\Lucene-2_9_4\core\QueryParser\QueryParser.cs 17.04.2012  12:25:02
	
 		/// <summary> Factory method for generating a query (similar to
 		/// <see cref="GetWildcardQuery" />). Called when parser parses an input term
 		/// token that uses prefix notation; that is, contains a single '*' wildcard
 		/// character as its last character. Since this is a special case
 		/// of generic wildcard term, and such a query can be optimized easily,
 		/// this usually results in a different query object.
 		/// <p/>
 		/// Depending on settings, a prefix term may be lower-cased
 		/// automatically. It will not go through the default Analyzer,
 		/// however, since normal Analyzers are unlikely to work properly
 		/// with wildcard templates.
 		/// <p/>
 		/// Can be overridden by extending classes, to provide custom handling for
 		/// wild card queries, which may be necessary due to missing analyzer calls.
 		/// 
 		/// </summary>
 		/// <param name="field">Name of the field query will use.
 		/// </param>
 		/// <param name="termStr">Term token to use for building term for the query
 		/// (<b>without</b> trailing '*' character!)
 		/// 
 		/// </param>
 		/// <returns> Resulting <see cref="Query" /> built for the term
 		/// </returns>
 		/// <exception cref="ParseException">throw in overridden method to disallow
 		/// </exception>
 		public /*protected internal*/ virtual Query GetPrefixQuery(System.String field, System.String termStr)
 		{
 			if (!allowLeadingWildcard && termStr.StartsWith("*"))
 				throw new ParseException("'*' not allowed as first character in PrefixQuery");
 			if (lowercaseExpandedTerms)
 			{
 				termStr = termStr.ToLower();
 			}
-			Term t = new Term(field, termStr);
+         Term t = null;
+         TermQuery q = null;
+         try
+         {
+            q = GetFieldQuery(field, termStr) as TermQuery;
+         }
+         catch (Exception)
+         {
+            // Analysis failed; fall back to the unanalyzed term below.
+         }
 
+         if (q != null)
+         {
+            t = new Term(field, q.GetTerm().text);
+         }
+         else
+         {
+            t = new Term(field, termStr);
+         }
 			return NewPrefixQuery(t);
 		}
                
> Wildcard queries are not analyzed
> ---------------------------------
>
>                 Key: LUCENENET-486
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-486
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Contrib, Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4
>         Environment: Windows 7, Visual Studio 2010, .net 4.0
>            Reporter: Björn
>         Attachments: LuceneTest.zip
>
>
> The Lucene 'QueryParser' doesn't analyze wildcard queries. The function 'GetPrefixQuery' (QueryParser.cs) returns the string without any analysis.
> I have performed some queries to show the problem. The analyzer is the 'Contrib.Analyzers.DE.GermanAnalyzer'.
> ---------- indexed word: 'Häuser'; in the index stemmed as: 'hau' ----------
> query: Hau*; hit: yes
> query: Hause*; hit: no; This should be a hit.....
> ---------- indexed word: 'Angebote'; in the index stemmed as: 'angebo' ----------
> query: Angebo*; hit: yes
> query: Angebot*; hit: no; This should be a hit.....
> query: Angebote*; hit: no; This should be a hit.....
> ---------- indexed word: 'Björn'; in the index stemmed as: 'bjor' ----------
> query: Bjor*; hit: yes
> query: Björ*; hit: no; This should be a hit.....
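
The 'Björ*' case above can be reproduced without Lucene.Net. The following is only a minimal sketch under stated assumptions: PrefixDemo, ToyAnalyze and PrefixHits are hypothetical names, and ToyAnalyze is a stand-in that lower-cases and folds umlauts. It is not the GermanAnalyzer and does no stemming, so it does not cover the 'Hause*' or 'Angebot*' cases, which depend on the real stemmer.

```csharp
using System;
using System.Linq;

public static class PrefixDemo
{
    // Terms as the issue reports them stored in the index after analysis.
    public static readonly string[] IndexedTerms = { "hau", "angebo", "bjor" };

    // Hypothetical stand-in for an analyzer: lower-cases and folds German umlauts.
    // Assumption: it does NOT stem, unlike the real GermanAnalyzer.
    public static string ToyAnalyze(string text)
    {
        return text.ToLowerInvariant()
                   .Replace('ä', 'a')
                   .Replace('ö', 'o')
                   .Replace('ü', 'u');
    }

    // True if any indexed term starts with the given prefix.
    public static bool PrefixHits(string prefix)
    {
        return IndexedTerms.Any(t => t.StartsWith(prefix, StringComparison.Ordinal));
    }

    public static void Main()
    {
        // Current behaviour: the prefix is only lower-cased, giving "björ".
        string raw = "Björ".ToLowerInvariant();
        // Behaviour the patch aims for: the prefix is analyzed first, giving "bjor".
        string analyzed = ToyAnalyze("Björ");

        Console.WriteLine(PrefixHits(raw));      // False
        Console.WriteLine(PrefixHits(analyzed)); // True
    }
}
```

The lower-cased prefix still contains 'ö', so it cannot match the folded term 'bjor' in the index. Running the prefix through the same normalization as the indexed text makes the two comparable, which is what the patch attempts by calling GetFieldQuery.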

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira