lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module
Date Mon, 07 May 2012 13:36:53 GMT


Robert Muir commented on LUCENE-2510:

I don't really know much about NamedSPILoader, but I think I see what you're suggesting. How would
we support Factories loading unrelated classes like they can through ResourceLoader now? Assume
they're on the classpath and use Class.forName?

It needs more discussion (and input from Uwe would help!), but it works like Charset.forName("ASCII")
etc. We use this already for codecs and postingsformats (Codec.forName, Codec.listAllCodecs,

Have a look at lucene/core/src/resources/META-INF/services for the idea. Basically you "register"
your classes in your jar file this way: additional jar files (e.g. look at
lucene/test-framework/src/resources/META-INF) can load more classes.
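
As a rough illustration of the registration mechanism described above, here is a minimal sketch using the JDK's own java.util.ServiceLoader. The NamedThing interface and the SpiDiscoverySketch class are hypothetical names made up for this example; they are not Lucene classes, and this is not claimed to be how NamedSPILoader is implemented.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical SPI interface, for illustration only; not a Lucene class.
interface NamedThing {
    String name();
}

final class SpiDiscoverySketch {

    // Collects the names of all providers registered for NamedThing.
    // A jar registers a provider by shipping a text file named
    //   META-INF/services/NamedThing
    // (the fully qualified name of the interface) that lists one
    // implementation class per line. If no such file is on the
    // classpath, the returned list is empty.
    static List<String> discoveredNames() {
        List<String> names = new ArrayList<>();
        for (NamedThing thing : ServiceLoader.load(NamedThing.class)) {
            names.add(thing.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // The JDK analogy from the comment: look a charset up by name.
        Charset ascii = Charset.forName("ASCII");
        System.out.println(ascii.name());

        // Providers contributed by whatever jars are on the classpath.
        System.out.println(discoveredNames());
    }
}
```

Dropping another jar with its own META-INF/services file on the classpath would add its providers to the result without any code change, which is the property being discussed here.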

So this could support something simple like TokenizerFactory.forName("Whitespace").
Someone would not need to use a particular namespace to be able to load their analyzer
stuff easily: they use whatever package they want and register in their META-INF.
And added jar files (other analysis jars) are automatically available this way.
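
To make the forName idea concrete, here is a small sketch of a name-based registry. Everything in it (NamedFactory, NamedRegistry, availableNames, fromClasspath) is a hypothetical name chosen for illustration; it assumes each factory can report its own short name, and it is not the actual TokenizerFactory or NamedSPILoader API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.Set;

// Hypothetical factory type; a stand-in for the factories discussed here.
interface NamedFactory {
    String name();
}

// Hypothetical registry that maps short names to registered providers.
final class NamedRegistry<T extends NamedFactory> {
    private final Map<String, T> byName = new LinkedHashMap<>();

    // Accepts any Iterable of providers; a ServiceLoader is one such Iterable.
    NamedRegistry(Iterable<? extends T> providers) {
        for (T provider : providers) {
            // Assumption for this sketch: the first registration of a name wins.
            byName.putIfAbsent(provider.name(), provider);
        }
    }

    // Builds the registry from whatever is registered under META-INF/services.
    static <T extends NamedFactory> NamedRegistry<T> fromClasspath(Class<T> type) {
        return new NamedRegistry<>(ServiceLoader.load(type));
    }

    // Looks a provider up by its short name, in the spirit of Charset.forName.
    T forName(String name) {
        T found = byName.get(name);
        if (found == null) {
            throw new IllegalArgumentException(
                "No provider named '" + name + "'; available: " + byName.keySet());
        }
        return found;
    }

    Set<String> availableNames() {
        return Collections.unmodifiableSet(byName.keySet());
    }
}

final class NamedRegistrySketch {
    public static void main(String[] args) {
        // Providers are passed in directly here so the sketch runs without
        // any META-INF/services file; fromClasspath would discover them instead.
        NamedFactory whitespace = () -> "Whitespace";
        NamedRegistry<NamedFactory> registry = new NamedRegistry<>(List.of(whitespace));

        System.out.println(registry.availableNames());
        System.out.println(registry.forName("Whitespace") == whitespace);
    }
}
```

The caller only needs the short name; which package the implementation lives in, and which jar supplied it, does not matter to the lookup.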

I think Uwe mentioned this idea before, though I think he had Analyzers in mind (e.g. provide
language code and get back analyzer or something). Anyway, that's for another issue :)

Just something worth consideration if we want to make these modules really pluggable. On the
other hand we shouldn't use anything overkill if it's not the right fit...
> migrate solr analysis factories to analyzers module
> ---------------------------------------------------
>                 Key: LUCENE-2510
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: modules/analysis
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>             Fix For: 4.0
>         Attachments: LUCENE-2510-parent-classes.patch, LUCENE-2510-parent-classes.patch,
>                      LUCENE-2510-parent-classes.patch, LUCENE-2510-resourceloader-bw.patch,
>                      LUCENE-2510.patch, LUCENE-2510.patch, LUCENE-2510.patch
> In LUCENE-2413 all TokenStreams were consolidated into the analyzers module.
> This is a good step, but I think the next step is to put the Solr factories into the
> analyzers module, too.
> This would make analyzers artifacts plugins to both lucene and solr, with benefits such as:
> * users could use the old analyzers module with solr, too. This is a good step to use
>   real library versions instead of Version for backwards compat.
> * analyzers modules such as smartcn and icu, that aren't currently available to solr
>   users due to large file sizes or dependencies, would be simple optional plugins to solr and
>   easily available to users that want them.
> Rough sketch in this thread:
> Practically, I haven't looked much and don't really have a plan for how this will work
> yet, so ideas are very welcome.


