lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tommaso Teofili (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4345) Create a Classification module
Date Wed, 12 Sep 2012 09:17:07 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453853#comment-13453853
] 

Tommaso Teofili commented on LUCENE-4345:
-----------------------------------------

bq. Can we remove the ClassificationException? It only seems to box IOException... we can
just throw IOException directly instead?

sure, we can keep IOException for now

bq. What is the scale that you expect this bayesian classifier to handle? How many training
documents does it need?

I'm doing some benchmarking in these days therefore I should be able to say something about
this shortly.
                
> Create a Classification module
> ------------------------------
>
>                 Key: LUCENE-4345
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4345
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>            Priority: Minor
>         Attachments: LUCENE-4345_2.patch, LUCENE-4345.patch, SOLR-3700_2.patch, SOLR-3700.patch
>
>
> Lucene/Solr can host huge sets of documents containing lots of information in fields
so that these can be used as training examples (w/ features) in order to very quickly create
classifiers algorithms to use on new documents and / or to provide an additional service.
> So the idea is to create a contrib module (called 'classification') to host a ClassificationComponent
that will use already seen data (the indexed documents / fields) to classify new documents
/ text fragments.
> The first version will contain a (simplistic) Lucene based Naive Bayes classifier but
more implementations should be added in the future.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message