lucene-java-user mailing list archives

From Flavio Eduardo de Cordova <>
Subject Problems with StandardTokenizer
Date Mon, 07 Jul 2003 23:26:16 GMT

	I've created a custom analyzer that uses the StandardTokenizer class
to get the tokens from the reader.
	It seemed to work fine, but I just noticed that some large documents
are not having all their content properly indexed; only the starting part
of them is.
	After some debugging I've found out that StandardTokenizer reads up
to 10001 tokens from the reader.

	Has anybody been through something like this before? What should I
do as a workaround?
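	(For the record: in Lucene of this era the cap is most likely not in
StandardTokenizer itself but in IndexWriter, whose maxFieldLength field
defaults to 10,000 tokens per field; everything beyond that is silently
dropped at indexing time. A minimal sketch of raising it, assuming a
Lucene 1.x release where maxFieldLength is a public field and the index
path "index" stands in for your own location:)

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class RaiseTokenLimit {
    public static void main(String[] args) throws Exception {
        // Open (and here, create) an index; the path and analyzer are
        // placeholders for your own setup.
        IndexWriter writer = new IndexWriter("index", new StandardAnalyzer(), true);

        // maxFieldLength defaults to 10,000 tokens per field; tokens past
        // that limit are silently discarded. Raise it so large documents
        // are indexed in full.
        writer.maxFieldLength = 1000000;

        // ... add documents as usual, then clean up ...
        writer.close();
    }
}
```

(Later Lucene versions expose the same knob as a setter,
writer.setMaxFieldLength(...), instead of a public field.)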

Thanks !

