lucene-java-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: setMaxClauseCount ??
Date Wed, 21 Jan 2004 15:31:25 GMT
Andrzej Bialecki wrote:
> Karl Koch wrote:
>> I actually wanted to add a large amount of text from an existing
>> document to find a closely related one. Can you suggest another good
>> way of doing this?
>
> You should try to reduce the dimensionality by reducing the number of 
> unique features. In this case, you could for example use only keywords 
> (or key phrases) instead of the full content of documents.

Indeed, this is a good approach.  In my experience, six or eight terms 
are usually enough, and they needn't all be required.
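
As a rough illustration of the keyword idea, here is a minimal plain-Java sketch, not Lucene API code. It assumes raw term frequency with a tiny stop list is a good enough stand-in for "keywords"; the class name KeyTermQuery, the methods topTerms and toQuery, and the stop list are all hypothetical names made up for this example. The resulting string leaves every term without a "+" prefix, so under the query parser's default OR behaviour none of the terms is required.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class KeyTermQuery {
    // Tiny illustrative stop list (an assumption); a real setup would use a fuller one.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "on", "with"));

    /** Returns up to maxTerms distinct terms, most frequent first; ties are broken alphabetically. */
    public static List<String> topTerms(String text, int maxTerms) {
        Map<String, Integer> counts = new HashMap<>();
        // Lowercase, then split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        List<String> terms = new ArrayList<>(counts.keySet());
        terms.sort((x, y) -> {
            int byCount = counts.get(y) - counts.get(x);
            return byCount != 0 ? byCount : x.compareTo(y);
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }

    /** Joins the terms with spaces; no term carries a '+' prefix, so none is marked required. */
    public static String toQuery(List<String> terms) {
        return String.join(" ", terms);
    }
}
```

Calling KeyTermQuery.toQuery(KeyTermQuery.topTerms(documentText, 8)) would give a short query string of at most eight terms, which stays far below the clause limit discussed in this thread. Weighting terms by how rare they are in the index, rather than by raw frequency, would likely pick better keywords.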

Doug



