I'm creating a tokenized "content" Field from a plain text file
using an InputStreamReader and new Field("content", in);
The text file is large, 20 MB, and contains zillions lines,
each with the the same 100-character token.
That causes an OutOfMemoryError.
Given that all tokens are the *same*,
why should this cause an OutOfMemoryError?
Shouldn't StandardAnalyzer just chug along
and just note "ho hum, this token is the same"?
That shouldn't take too much memory.
Or have I missed something?
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
|