opennlp-issues mailing list archives

From "Boris Galitsky (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OPENNLP-253) Add text similarity / relevance / syntactic match component based on parse trees
Date Wed, 17 Aug 2011 23:58:27 GMT

     [ https://issues.apache.org/jira/browse/OPENNLP-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boris Galitsky updated OPENNLP-253:
-----------------------------------

    Attachment: text_similarity_proposal_for_opennlp.test.zip

current tests

> Add text similarity / relevance / syntactic match component based on parse trees
> --------------------------------------------------------------------------------
>
>                 Key: OPENNLP-253
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-253
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Parser
>    Affects Versions: 1.6.0
>         Environment: Java
>            Reporter: Boris Galitsky
>             Fix For: 1.6.0
>
>         Attachments: text_similarity_proposal_for_opennlp.test.zip, text_similarity_proposal_for_opennlp.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The proposed component relies on the OpenNLP parser and gives search engineers a simple relevance
> verification tool based on machine learning of syntactic parse trees.
> The value for the search engineering community is that engineers don't have to be familiar with NLP
> to use the syntactic generalization component: OpenNLP performs the parsing/chunking, and the
> proposed component then applies graph-based learning for relevance assessment.
> One expected usage scenario is that a search library such as Lucene retrieves results, and
> this component accepts relevant results and rejects irrelevant ones according to the proposed
> syntactic generalization measure.
> This code has been deployed commercially over the last 2 years at datran.com and zvents.com
> and serves more than 20 million users monthly.
> There are a number of publications on this project, including:
> http://portal.acm.org/citation.cfm?id=1881190
> http://www.aaai.org/ocs/index.php/FLAIRS/FLAIRS11/paper/view/2573

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
