lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: WhiteSpaceTokenizer
Date Fri, 15 Aug 2014 11:48:53 GMT
Yeah, it should be documented better, and configurable.

Some discussion of related issues here:
https://issues.apache.org/jira/browse/LUCENE-1118
https://issues.apache.org/jira/browse/SOLR-4148

I actually filed a Jira for this already. No action so far, but PLEASE feel 
free to comment on it:
https://issues.apache.org/jira/browse/LUCENE-5785

-- Jack Krupansky

-----Original Message----- 
From: Sheng
Sent: Thursday, August 14, 2014 11:38 PM
To: java-user@lucene.apache.org
Subject: WhiteSpaceTokenizer

A token has to be shorter than 255 characters; otherwise this tokenizer
behaves unpredictably. I see that 255 is set as a private final in the
source code, but no documentation explicitly addresses it. Can we either
make that number configurable (if that is not an option, I'd like to know
why) or add a note to its Javadoc? I had a hard time figuring that out...
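
To make the question concrete, here is a minimal, self-contained sketch of what a configurable limit could look like. It does not use Lucene and is not the tokenizer's actual code: the class name MaxLenWhitespaceSplitter and the maxTokenLen parameter are made up for illustration, and cutting an over-long run into consecutive chunks is an assumption about the behavior, not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; NOT Lucene's WhitespaceTokenizer.
public class MaxLenWhitespaceSplitter {

    // Splits text on whitespace. Assumption: a run of non-whitespace
    // characters longer than maxTokenLen is cut into consecutive chunks
    // of at most maxTokenLen chars instead of being kept whole.
    static List<String> tokenize(String text, int maxTokenLen) {
        if (maxTokenLen <= 0) {
            throw new IllegalArgumentException("maxTokenLen must be positive");
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current token, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
                // Buffer is full: emit the chunk and start a new token.
                if (current.length() == maxTokenLen) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            }
        }
        // Flush whatever is left at end of input.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Build one 300-character "word" preceded by a short one.
        StringBuilder longRun = new StringBuilder();
        for (int i = 0; i < 300; i++) {
            longRun.append('x');
        }
        List<String> tokens = tokenize("ab " + longRun, 255);
        for (String t : tokens) {
            System.out.println(t.length()); // prints 2, 255, 45
        }
    }
}
```

With the limit passed in as a parameter, a caller who expects long tokens (URLs, hashes, base64 blobs) could raise it rather than being surprised by the split.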



