lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Lea <ian....@gmail.com>
Subject Re: Wild card and multiple keyword search
Date Wed, 13 Jul 2005 12:48:46 GMT
Sounds to me that all you need is to AND rather than OR your search terms.

	QueryParser qp = new QueryParser("keywords", analyzer);
	qp.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
        Query q = qp.parse(words);

where analyzer is just the standard one.

Or search for +MAIN +BOARD.  Or MAIN AND BOARD.


--
Ian.


On 13 Jul 2005 12:18:44 -0000, Rahul D Thakare
<rahul_thakare@rediffmail.com> wrote:
>  
> Hi,
> 
>  We are using doc.add(Field.Text("keywords",keywords)); to add the keywords to the document,
where keywords is comma separated keywords string.
> Lucene seems to tokenize the keywords with multiple words like(MAIN BOARD) as different
keywords(ie as MAIN and BOARD). Tokenization is based on comma and space...So if we search
for "MAIN BOARD", documents having keywords like "MAIN LOGIC", "MAIN PARTS", etc also show
up
> 
> If one searches for "MAIN BOARD", we want get only the documents have "MAIN BOARD". 
How to do this ?
> 
> To achieve this we used doc.add(Field.Keyword("keywords", keywords)); and while searching
> we cannot use standard analyzer, while searching, as divides the keywords if we search
keywords having space... so we wrote an KeywordAnalyser(KeywordAnalyzer is basically returns
only one single token) as given below.
> 
> /**
>  * Tokenizes the entire stream as single token
>  */
> 
>  public class KeywordAnalyzer extends Analyzer
>  {
>          public TokenStream tokenStream(String fieldName, final Reader reader)
>          {
>                  return new TokenStream(){
>                          private boolean done;
>                          private final char[] buffer = new char[1024];
>                          public Token next() throws IOException
>                          {
>                                  if(!done)
>                                  {
>                                          done = true;
>                                          StringBuffer buffer = new StringBuffer();
>                                          int length = 0;
>                                          while(true)
>                                          {
>                                                  length = reader.read(this.buffer);
>                                                  if(length == -1) break;
> 
>                                                  buffer.append(this.buffer,0,length);
>                                          }
>                                          String text = buffer.toString();
>                                          return new Token(text.toUpperCase(),0,text.length());
>                                  }
>                                  return null;
>                          }
>                  };
>          }
>  }
> 
> Which solve the above said problem, but we are not able to the wild card searchs like
MAIN*, etc.
> 
> We need both the functionality ie.
> 1.  if user searches for MAIN BOARD, should get only documents that contain MAIN BOARD
and not MAIN LOGIC, MAIN, MAIN PART etc.
> 2. User should be able to do the wild card search like MAIN*, etc and get the desired
documents.
> 
> Please let us know, how we should do the indexing ? and which analyzer to use to do the
search ?
> 
> thanks
> Rahul...
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message