lucene-java-user mailing list archives

From Flavio Eduardo de Cordova <flavio.cord...@datasul.com.br>
Subject Problems with StandardTokenizer
Date Mon, 07 Jul 2003 23:26:16 GMT
People...

	I've created a custom analyzer that uses the StandardTokenizer class
to get the tokens from the reader.
	It seemed to work fine, but I've just noticed that some large documents
are not having all their content properly indexed: only the starting
part of them is.
	After some debugging I found out that StandardTokenizer reads up
to 10001 tokens from the reader.
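	The behaviour described above looks like a fixed per-document token cap
somewhere in the indexing chain rather than a bug in the tokenizer itself. As a
stdlib-only illustration (all names below are hypothetical, not Lucene API), a
sketch of how such a cap silently truncates a large document, and how raising
the limit restores full coverage:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo: a tokenizing loop that stops after a fixed cap,
// mirroring the ~10,000-token cutoff observed during indexing.
public class TokenCapDemo {
    // Hypothetical default cap, chosen to match the observed cutoff.
    static final int DEFAULT_MAX_TOKENS = 10_000;

    // Collect whitespace-delimited tokens, but never more than maxTokens.
    static List<String> tokenize(String text, int maxTokens) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens() && tokens.size() < maxTokens) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Build a "large document" of 12,000 tokens.
        StringBuilder doc = new StringBuilder();
        for (int i = 0; i < 12_000; i++) {
            doc.append("word").append(i).append(' ');
        }
        // With the cap in place, the tail of the document is silently dropped.
        System.out.println(tokenize(doc.toString(), DEFAULT_MAX_TOKENS).size());
        // Raising the limit indexes the whole document.
        System.out.println(tokenize(doc.toString(), Integer.MAX_VALUE).size());
    }
}
```

	If the cutoff really comes from a configurable limit like this, the
workaround would be to raise that limit before indexing large documents, rather
than to change the tokenizer.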

	Has anybody run into something like this before? What should I
do as a workaround?

Thanks!

Flavio

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

