lucene-dev mailing list archives

From "Maciej Lizewski (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
Date Wed, 17 Apr 2013 08:51:16 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633907#comment-13633907 ]

Maciej Lizewski commented on LUCENE-2899:
-----------------------------------------

Why don't you prepare this as a separate project that produces some jars and config files, with
instructions on how to add it to the Solr configuration, instead of publishing all changes as patches
to the Solr sources? I am interested in doing some tests with your library, but setting everything
up seems quite complicated and hard to maintain in the future... It is just a thought.
                
> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
>                 Key: LUCENE-2899
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2899
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 4.3
>
>         Attachments: LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a
> submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I
> have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have
>   to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads
> (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

