lucene-java-user mailing list archives

From "Biswas, Goutam_Kumar" <Goutam-Kumar-Bis...@deshaw.com>
Subject RE: prefixquery not working on migrating to Lucene 1.3
Date Mon, 28 Apr 2003 05:35:04 GMT
Otis,
I am using the Analyzer below. Could you point out what I need to change
so that '\' characters are not discarded? Also, I thought that wildcard
query terms that end with a * (like path:/u/biswasg/demo\ Docs*) do not
pass through the analyzer. Am I correct?

<---------------------------------snip--------------------------------->
import java.io.Reader;
import java.util.Hashtable;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

/**
 * Custom Analyzer used by Lucene to analyze text during both indexing
 * and searching.
 *
 * @author Velayudham Radhakrishnan
 * @version $Id: MyAnalyzer.java,v 1.4 2003/01/30 12:09:25 dantam Exp $
 */
public class MyAnalyzer extends Analyzer
{
    /**
     * Default no-arg constructor; uses the built-in STOP_WORDS list.
     */
    public MyAnalyzer()
    {
        this.stopWords = STOP_WORDS;
        this.stopTable = StopFilter.makeStopTable(stopWords);
    }

    /**
     * Constructor with 1 arg.
     *
     * @param stopWords an array of stop words.
     */
    public MyAnalyzer(String[] stopWords)
    {
        this.stopWords = stopWords;
        this.stopTable = StopFilter.makeStopTable(stopWords);
    }

    /**
     * Create a token stream for this analyzer.
     *
     * @param reader Reader from which data is read.
     */
    public final TokenStream tokenStream(final Reader reader)
    {
        // Tokenize, then normalize, lowercase, drop stop words, and stem.
        TokenStream result = new StandardTokenizer(reader);

        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, stopTable);
        result = new PorterStemFilter(result);

        return result;
    }

    // Common words that are not usually useful for searching. Kept per
    // instance (not static) so that analyzers built with different stop
    // lists do not overwrite each other's tables.
    private final String[] stopWords;

    // Stop table built from stopWords.
    private final Hashtable stopTable;

    // Default stop words.
    private static final String[] STOP_WORDS = {
        "a"       , "and"     , "are"     , "as"      ,
        "at"      , "be"      , "but"     , "by"      ,
        "for"     , "if"      , "in"      , "into"    ,
        "is"      , "it"      , "no"      , "not"     ,
        "of"      , "on"      , "or"      , "s"       ,
        "such"    , "t"       , "that"    , "the"     ,
        "their"   , "then"    , "there"   , "these"   ,
        "they"    , "this"    , "to"      , "was"     ,
        "will"    , "with"
    };
}

<--------------------------------/snip--------------------------------->

Thanks,
  Goutam
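
For readers following the question above about where '\' gets lost, here
is a rough, Lucene-free model of what a tokenize-then-lowercase chain
does to a path. It is only a sketch: the class and method names are
hypothetical, splitting on every non-letter/digit character is an
assumption that merely approximates StandardTokenizer (whose real
grammar is more involved), and stop words and stemming are left out.

```java
import java.util.ArrayList;
import java.util.List;

public class AnalyzerModel {

    // Rough stand-in for tokenizing plus lowercasing: every character
    // that is not a letter or digit ends the current token and is
    // discarded. This is an assumption, not Lucene's actual grammar.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [u, biswasg, install, jakarta, tomcat]
        System.out.println(analyze("/u/biswasg/Install/jakarta-tomcat"));
        // prints [u, biswasg, demo, docs]
        System.out.println(analyze("/u/biswasg/demo\\ Docs"));
    }
}
```

Under this model, '/', '\' and '-' never reach the output, and the
original case is gone, so any text that goes through such a chain can no
longer be compared character for character with the raw path.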

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
Sent: Monday, April 28, 2003 1:23 AM
To: Lucene Users List
Subject: RE: prefixquery not working on migrating to Lucene 1.3


This ought to get entered in the FAQ at jGuru...
You need to use an Analyzer that does not throw away characters like
'\'.

Otis
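
One way to read the advice above: if a path is kept as a single,
unmodified term, a prefix comparison has every character available,
including '-' and the original case. The sketch below models that idea
in plain Java. It does not use Lucene's API; the class name, method name
and sample paths are hypothetical and purely illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class KeywordPrefixModel {

    // Return every term that starts with the given prefix, compared
    // character for character (case-sensitive, nothing dropped).
    static List<String> prefixMatch(List<String> terms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Illustrative paths only, each kept as one unmodified term.
        List<String> terms = Arrays.asList(
            "/u/biswasg/Install/jakarta-tomcat/README",
            "/u/biswasg/Install/jakarta-ant/README");

        // '-' is just another character of the prefix here.
        // prints [/u/biswasg/Install/jakarta-tomcat/README]
        System.out.println(
            prefixMatch(terms, "/u/biswasg/Install/jakarta-tomcat"));

        // Lowercasing the prefix first makes it miss the capital 'I'.
        // prints []
        System.out.println(
            prefixMatch(terms, "/u/biswasg/Install".toLowerCase(Locale.ROOT)));
    }
}
```

The second call mirrors the lowercasing symptom reported at the start of
this thread: the stored term keeps its capital letter, the prefix does
not, and the comparison fails.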


--- "Biswas, Goutam_Kumar" <Goutam-Kumar-Biswas@deshaw.com> wrote:
> Otis,
> 
> Your suggestion worked. Thanks. However, there is one more problem. If
> the path contains a '-', I do not get any results, even if I escape
> the '-'. For example: path:/u/biswasg/Install/jakarta\-tomcat*. If I
> search for path:/u/biswasg/Install/jakarta*, however, I get the
> correct results, so I figure that the '-' is causing the problem here.
> How do I deal with these cases?
> 
> Thanks always,
> Goutam
> 
> 
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: Sunday, April 27, 2003 7:06 PM
> To: Lucene Users List
> Subject: Re: prefixquery not working on migrating to Lucene 1.3
> 
> 
> I think that may be due to one of the new QueryParser methods,
> possibly setWildcardLowercase(boolean). Check the source for the exact
> method name.
> 
> Otis
> 
> 
> --- "Biswas, Goutam_Kumar" <Goutam-Kumar-Biswas@deshaw.com> wrote:
> > Hi,
> > 
> > I have been using queries like filename:(txt) AND
> > path:(/u/biswasg/Install*) with Lucene 1.2, which gave me correct
> > results. I moved to Lucene 1.3 a while ago and find that these
> > queries no longer work. The Lucene Query is:
> > +txt +path:/u/biswasg/install*. I observe that the path has been
> > lowercased (which did not happen when I was using 1.2).
> > 
> > I made the following changes in my code when I moved over to 1.3.
> > 
> >         QueryParser qp = new QueryParser(defaultSearchField,
> >                                          new MyAnalyzer());
> >         qp.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
> > 
> > How can I prevent Lucene from lowercasing query terms that end with
> > a *? I must mention that my objective here is to restrict my search
> > results to those files that begin with a specified prefix.
> > 
> > Any help on this is appreciated.
> > 
> > Thanks,
> > -Goutam
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> > 
> 
> 
> __________________________________
> Do you Yahoo!?
> The New Yahoo! Search - Faster. Easier. Bingo.
> http://search.yahoo.com
> 



