lucene-dev mailing list archives

From "Tomoko Uchida (JIRA)" <>
Subject [jira] [Commented] (LUCENE-8873) Improve analyzer factoryies' Javadoc.
Date Sat, 22 Jun 2019 04:02:00 GMT


Tomoko Uchida commented on LUCENE-8873:

I'm trying to find a handy way to properly manage and document the properties (for both
developers and users).
For example, would pseudocode like the following look good?
/**
 * Factory for {@link NGramTokenizer}.
 * @since 3.1
 * @lucene.spi {@value #NAME}
 */
public class NGramTokenizerFactory extends TokenizerFactory {

  /** SPI name */
  public static final String NAME = "nGram";

  /** Property {@value #PROP_MAX_GRAM_SIZE} - Maximum gram size */
  public static final String PROP_MAX_GRAM_SIZE = "maxGramSize";
  /** Property {@value #PROP_MIN_GRAM_SIZE} - Minimum gram size */
  public static final String PROP_MIN_GRAM_SIZE = "minGramSize";

  @...("maxGramSize", required=false, default=NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE)
  private final int maxGramSize;
  @...("minGramSize", required=false, default=NGramTokenizer.DEFAULT_MIN_NGRAM_SIZE)
  private final int minGramSize;

  /** Creates a new NGramTokenizerFactory */
  public NGramTokenizerFactory(Map<String, String> args) {
    super(args);
    /* All properties are derived from annotations (in the superclass's constructor),
       so we don't have to set those manually */
    // minGramSize = getInt(args, "minGramSize", NGramTokenizer.DEFAULT_MIN_NGRAM_SIZE);
    // maxGramSize = getInt(args, "maxGramSize", NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE);
    if (!args.isEmpty()) {
      throw new IllegalArgumentException("Unknown parameters: " + args);
    }
  }
}

[~thetaphi]: if you have anything in mind about the interface design, please share
your thoughts.
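
A minimal, self-contained sketch of how the "properties derived from annotations in the superclass's constructor" idea above could work. Everything here is an assumption made for illustration: the annotation `FactoryProperty`, the base class `AnnotatedFactory`, and the subclass `SketchNGramFactory` are hypothetical names and not part of Lucene, only `int` properties are handled, and the defaults "1" and "2" are placeholder values. The fields are non-final and have no initializers, so the values written reflectively by the superclass constructor are not overwritten afterwards.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;

/** Hypothetical annotation marking a field as a factory property (not a Lucene API). */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface FactoryProperty {
  String name();
  boolean required() default false;
  String defaultValue() default "";
}

/** Hypothetical base class that fills annotated int fields from the args map. */
abstract class AnnotatedFactory {
  protected AnnotatedFactory(Map<String, String> args) {
    // Only fields declared directly on the concrete class are inspected in this sketch.
    for (Field field : getClass().getDeclaredFields()) {
      FactoryProperty prop = field.getAnnotation(FactoryProperty.class);
      if (prop == null) {
        continue;
      }
      // Consume the argument so that leftovers can be reported as unknown.
      String raw = args.remove(prop.name());
      if (raw == null) {
        if (prop.required()) {
          throw new IllegalArgumentException("Missing required parameter: " + prop.name());
        }
        raw = prop.defaultValue();
      }
      try {
        field.setAccessible(true);
        field.setInt(this, Integer.parseInt(raw));
      } catch (IllegalAccessException e) {
        throw new IllegalStateException(e);
      }
    }
    if (!args.isEmpty()) {
      throw new IllegalArgumentException("Unknown parameters: " + args);
    }
  }
}

/** Hypothetical factory whose properties are declared only through annotations. */
class SketchNGramFactory extends AnnotatedFactory {
  @FactoryProperty(name = "minGramSize", defaultValue = "1") // placeholder default
  private int minGramSize;

  @FactoryProperty(name = "maxGramSize", defaultValue = "2") // placeholder default
  private int maxGramSize;

  SketchNGramFactory(Map<String, String> args) {
    super(args);
  }

  int minGramSize() {
    return minGramSize;
  }

  int maxGramSize() {
    return maxGramSize;
  }
}
```

Because the annotations are retained at runtime, the same metadata could in principle also be read by a documentation or validation tool to list every property a factory accepts.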

> Improve analyzer factoryies' Javadoc.
> -------------------------------------
>                 Key: LUCENE-8873
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Tomoko Uchida
>            Priority: Minor
> Currently, the documentation for analyzer factories (subclasses of {{TokenizerFactory}},
> {{CharFilterFactory}}, {{TokenFilterFactory}}) still includes lots of Solr schema.xml examples
> and not all properties are documented. From my perspective, the latter is more problematic
> because users who want to use the factories have to refer to source code to know what properties
> are defined.
> To improve documentation, XML examples should be removed for cleanup, and instead, *all
> properties which can be passed to factory constructors should be properly documented*.
> Documentation is often overlooked so some validation rules and standardization effort
> would be desired (e.g. marking properties by annotations).

This message was sent by Atlassian JIRA
