lucene-java-user mailing list archives

From David Causse <dcau...@spotter.com>
Subject Re: [OT] About stopwords
Date Thu, 27 Nov 2008 14:00:00 GMT
Thanks for the tip,

but I can't imagine the number of documents Google has to join in order
to process such results...
There must be a trick.
Maybe stop words are not indexed on their own but twice, once paired with
the previous token and once with the next, as some sort of 2-gram index?
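
To make the 2-gram idea concrete, here is a minimal sketch in plain Java
(no Lucene classes). The class name, the stop list and the "_" joiner are
all made up for illustration. It is only a guess at how such an index
could be built, not a description of what Google actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopwordBigrams {
    // Hypothetical stop list, for illustration only.
    static final Set<String> STOPWORDS =
        new HashSet<>(Arrays.asList("a", "at", "of", "the", "how", "and"));

    // Returns the terms that would go into the index for one piece of text.
    // Ordinary words are indexed alone; a stop word is never indexed alone,
    // only inside the pairs it forms with its previous and next token.
    static List<String> indexTerms(String text) {
        List<String> terms = new ArrayList<>();
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return terms;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String tok = tokens[i];
            boolean stop = STOPWORDS.contains(tok);
            if (!stop) {
                terms.add(tok);
            }
            if (i + 1 < tokens.length) {
                String next = tokens[i + 1];
                // Emit a pair whenever either side is a stop word.
                if (stop || STOPWORDS.contains(next)) {
                    terms.add(tok + "_" + next);
                }
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(indexTerms("The Sound and the Fury"));
        // [the_sound, sound, sound_and, and_the, the_fury, fury]
        System.out.println(indexTerms("HOW at at of a A a"));
        // [how_at, at_at, at_of, of_a, a_a, a_a]
    }
}
```

With terms like these, a phrase made only of stop words becomes a lookup
of a few pair terms such as at_of, each of which should match far fewer
documents than "at" or "of" alone, so the join would be much cheaper.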

David.

Aleksander M. Stensby wrote:
> Your query includes quotation marks, which tell Google to include common
> words in the query.
> But if you remove the quotation marks, you will still get results, as
> Google states:
>
> "Google ignores stop words when they're placed in searches alongside 
> less common words. For example, a search for [ The Sound and the Fury 
> ] will only return results for the terms "Sound" and "Fury." However, 
> a search that only includes stop words -- [ The Who ], for example -- 
> will be processed as is."
>
> The key here is "when they're placed in searches alongside less common 
> words".
> http://www.google.com/support/bin/answer.py?hl=en&answer=981
>
>
> Hope that answers your questions.
> Regards,
>  Aleks
>
>
> On Thu, 27 Nov 2008 14:34:00 +0100, David Causse <dcausse@spotter.com> 
> wrote:
>
>> Hi,
>>
>> Look at this Google query:
>> http://www.google.fr/search?q=%22HOW+at+at+of+a+A+a%22
>>
>> What does that tell us about stop words?
>> Does Google have no stop words?
>>
>> David.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
>

