lucene-java-user mailing list archives

From Ahmet Aksoy <ahme...@axtelsoft.com>
Subject Re: Top most frequent words
Date Thu, 12 May 2005 08:35:26 GMT
Hi John,
I haven't investigated the sources yet, but you might be right.
However, as you stated, those types of lists depend directly on the 
subject and the source.
Anyway, that is not very important for my study, and I'm sure the list 
will help me very much.
I will prepare optimized lists if I can obtain some different source sets.
Best regards.
Ahmet
John Haxby wrote:

> Otis Gospodnetic wrote:
>
>> Somebody asked about this today, and I just found this through Simpy:
>>  http://www.unine.ch/info/clef/
>>
>> Scroll half-way through the page, look on the right side:  1,000 most
>> frequent words for several languages.
>>  
>>
> Hmm.  I'm not sure how valuable that is.   For English "los" and 
> "angeles" are ranked 99 and 101 respectively and "officials" comes in 
> at 125.   Obviously I'm guessing, but those middle-ranking words have 
> come from a slightly skewed source -- newspapers in a fixed interval 
> perhaps.  (I don't think "Los Angeles" makes it into everyday 
> parlance in the UK, and "officials" suggests that we're obsessed with 
> bureaucracy :-))
>
> jch
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org


-- 
        Ahmet Aksoy
axtelsoft.com - armalink.com
    ahmetax.blogspot.com



