lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From thomas arni <...@zhwin.ch>
Subject Re: Common Words ignoring problem
Date Tue, 20 Mar 2007 06:40:55 GMT
You can adapt the source code of StopAnalyzer.java in the analysis 
package, or I suppose you can use the default constructor with a empty 
stop word list (but please check this).

If you don't know "Luke" use this small tool to display your index and 
verify your index process.
http://www.getopt.org/luke/

Stop words affect the performance. The seize of the index without stop 
words is much small ( up to 40%), because they occur soo often.

Thomas




aslam bari wrote:
> Ok, Thats fine. Thanks
> Now what if i don't want to stop any word, means i want lucene not to ignore any word.How
to do this?. And also doing this will afffect any performance or not?
>
> Thanks...
>
>
> ----- Original Message ----
> From: Grant Ingersoll <gsingers@apache.org>
> To: java-user@lucene.apache.org
> Sent: Monday, 19 March, 2007 5:47:40 PM
> Subject: Re: Common Words ignoring problem
>
>
> One of the constructors for StandardAnalyzer allows you to set your  
> stop words.  If you use the default constructor, you get the default  
> set of stop words, which is in StopAnalyzer.ENGLISH_STOP_WORDS.
>
> -Grant
>
> On Mar 19, 2007, at 6:14 AM, aslam bari wrote:
>
>   
>> Hello All,
>> I am using StandarAnalyzer for indexing documents. Then i make a  
>> query to search some words with And query.
>> For example I need to search for a document which contains  
>> followings all words
>> " this is garden".
>>
>> I think when lucene index the document , it ignores some common  
>> words like "this, is , are , am etc".  But when i say lucene to  
>> search all words including "this" AND "is" AND "garden" , then it  
>> could not found that. Am i right? If yes , how to solve this problem.
>>
>> Thanks...
>>
>>
>>         
>> __________________________________________________________
>> Yahoo! India Answers: Share what you know. Learn something new
>> http://in.answers.yahoo.com/
>>     
>
> --------------------------
> Grant Ingersoll
> Center for Natural Language Processing
> http://www.cnlp.org
>
> Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/ 
> LuceneFAQ
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> 		
> __________________________________________________________
> Yahoo! India Answers: Share what you know. Learn something new
> http://in.answers.yahoo.com/
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message