lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject RE: prefixquery not working on migrating to Lucene 1.3
Date Mon, 28 Apr 2003 14:53:09 GMT
Take a look at StandardFilter, I believe that's it.

Otis

--- "Biswas, Goutam_Kumar" <Goutam-Kumar-Biswas@deshaw.com> wrote:
> Otis,
> I am using the following Analyzer. Can you please point me as to
> where I
> need to change so that '\' characters are not thrown off. Also I
> thought
> that wild card query terms that end with a * (like
> path:/u/biswasg/demo\
> Docs*) do not pass through the analyzer. Am I correct ?
> 
>
<---------------------------------snip--------------------------------->
> import org.apache.lucene.analysis.standard.StandardTokenizer;
> 
> /**
>  * Personalized Analyser to be used by Lucene to analyze the text in
> both 
>  * indexing and searching.
>  *
>  * @author Velayudham Radhakrishnan
>  * @version $Id: MyAnalyzer.java,v 1.4 2003/01/30 12:09:25 dantam Exp
> $
>  */
> public class MyAnalyzer  extends Analyzer
> {   
>     /**
>      * Default no-arg Constructor
>      */
>     public MyAnalyzer()
>     {
> 	this.stopWords = STOP_WORDS;
> 	this.stopTable = StopFilter.makeStopTable(stopWords); 
>     }
>     
>     /*
>      * Constuctor with 1 arg.
>      * 
>      * @param stopWords an array to stop words.
>      */
>     public MyAnalyzer(String[] stopWords)
>     {
> 	this.stopWords = stopWords;
> 	this.stopTable = StopFilter.makeStopTable(stopWords); 
>     }
>     
>     /*
>      * Create a token stream for this analyzer.
>      *
>      * @param reader Reader from which data is read.
>      */
>     public final TokenStream tokenStream(final Reader reader)
>     {
> 	TokenStream result = new StandardTokenizer(reader);
> 	
> 	result = new StandardFilter(result);
> 	result = new LowerCaseFilter(result);
> 	result = new StopFilter(result, stopTable);
> 	result = new PorterStemFilter(result);
> 	
> 	return result;
>     }
>     
>     // An array containing some common words that are not usually
> useful for
> 
>     //searching.
>     private static String[] stopWords;
>     
>     // Stop table.
>     private static Hashtable stopTable;
> 
>     // Stop Words.
>     private static final String[] STOP_WORDS = {
> 	"a"       , "and"     , "are"     , "as"      ,
> 	"at"      , "be"      , "but"     , "by"      ,
> 	"for"     , "if"      , "in"      , "into"    ,
> 	"is"      , "it"      , "no"      , "not"     ,
> 	"of"      , "on"      , "or"      , "s"       ,
> 	"such"    , "t"       , "that"    , "the"     ,
> 	"their"   , "then"    , "there"   , "these"   ,
> 	"they"    , "this"    , "to"      , "was"     ,
> 	"will"    ,
> 	"with"
>     };    
> }
> 
>
<-----------------------------/snip-----------------------------------------
> ------->
> 
> Thanks,
>   Goutam
> 
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Monday, April 28, 2003 1:23 AM
> To: Lucene Users List
> Subject: RE: prefixquery not working on migrating to Lucene 1.3
> 
> 
> This ought to get entered in the FAQ at jGuru...
> You need to use an Analyzer that does not throw away characters like
> '\'.
> 
> Otis
> 
> 
> --- "Biswas, Goutam_Kumar" <Goutam-Kumar-Biswas@deshaw.com> wrote:
> > Otis,
> > 
> > Your suggestion worked. Thanks. However there is one more problem.
> If
> > the
> > path contains a '-' I do not get the results, even if I escape the
> > '-'. For
> > example: path:/u/biswasg/Install/jakarta\-tomcat*. If I search for
> > path:/u/biswasg/Install/jakarta*, however, I get the correct
> results.
> > So I
> > figure out that the '-' causing the problem here. How do I deal
> with
> > these
> > cases ?
> > 
> > Thanks always,
> > Goutam
> > 
> > 
> > -----Original Message-----
> > From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> > Sent: Sunday, April 27, 2003 7:06 PM
> > To: Lucene Users List
> > Subject: Re: prefixquery not working on migrating to Lucene 1.3
> > 
> > 
> > I think that may be due to one of the new QueryParser methods.
> > setWildcardLowercase(boolean), I think.  Check the source for the
> > exact
> > method name.
> > 
> > Otis
> > 
> > 
> > --- "Biswas, Goutam_Kumar" <Goutam-Kumar-Biswas@deshaw.com> wrote:
> > > Hi,
> > > 
> > > I have been using queries like: filename:(txt) AND
> > > path:(/u/biswasg/Install*) with Lucene 1.2 which gave me correct
> > > results. I
> > > moved to Lucene 1.3 a while ago and find that these queries no
> > longer
> > > work.
> > > The Lucene Query is: +txt +path:/u/biswasg/install*. I observe
> that
> > > the path
> > > has been lowercased (which did not happen when I was using 1.2).
> > > 
> > > I made the following changes in my code when I moved over to 1.3.
> > > 	
> > >         QueryParser qp = new QueryParser(defaultSearchField, new
> > > MyAnalyzer());
> > >         qp.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
> > > 
> > > How can I prevent Lucene from lower casing query terms that ends
> > with
> > > a *. I
> > > must mention that my objective here is to restrict my search
> > results
> > > to
> > > those files that begin with a spceified prefix.
> > > 
> > > Any help on this is appreciated.
> > > 
> > > Thanks,
> > > -Goutam
> > > 
> > > 
> > >
> >
> ---------------------------------------------------------------------
> > > To unsubscribe, e-mail:
> lucene-user-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail:
> > lucene-user-help@jakarta.apache.org
> > > 
> > 
> > 
> > __________________________________
> > Do you Yahoo!?
> > The New Yahoo! Search - Faster. Easier. Bingo.
> > http://search.yahoo.com
> > 
> >
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> > 
> >
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> > 
> 
> 
> __________________________________
> Do you Yahoo!?
> The New Yahoo! Search - Faster. Easier. Bingo.
> http://search.yahoo.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 


__________________________________
Do you Yahoo!?
The New Yahoo! Search - Faster. Easier. Bingo.
http://search.yahoo.com

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message