lucene-dev mailing list archives

From "Kevin Brubeck Unhammer (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (LUCENE-1284) Set of Java classes that allow the Lucene search engine to use morphological information developed for the Apertium open-source machine translation platform (http://www.apertium.org)
Date Fri, 18 Feb 2011 13:45:38 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996413#comment-12996413 ]

Kevin Brubeck Unhammer edited comment on LUCENE-1284 at 2/18/11 1:44 PM:
-------------------------------------------------------------------------

A little update: the Java port of lttoolbox has been complete for some time now, and the port
of apertium-tagger at least does disambiguation. Training of models (the .prob files) is not
supported yet, but all released pairs come with .prob files, so that's a non-issue:

{noformat}
$ echo 'jeg' | apertium-destxt-j | lt-proc-j nb-nn.automorf.bin | apertium-tagger-j -g nb-nn.prob -f
^jeg/jeg<prn><p1><mf><sg><nom>/jeg<n><nt><sg><ind>$^./.<sent><clb>$[][
]
{noformat}
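
For anyone who wants to consume that output from Java, here is a minimal sketch of splitting the stream into lexical units. It relies only on what is visible above: each unit sits between ^ and $, the surface form comes first, and the analyses follow, separated by /. The names ApertiumStreamSketch, LexicalUnit, parse, lemma and tags are hypothetical and are not part of the lttoolbox port or of the attached apertium-morph classes. The sketch also assumes no backslash-escaped delimiters occur in the input, which a real reader would have to handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an lttoolbox or Lucene API.
final class ApertiumStreamSketch {

    /** One lexical unit: the surface form and its candidate analyses. */
    static final class LexicalUnit {
        final String surface;
        final List<String> analyses;

        LexicalUnit(String surface, List<String> analyses) {
            this.surface = surface;
            this.analyses = analyses;
        }
    }

    private static final Pattern TAG = Pattern.compile("<([^>]+)>");

    /** Collects every ^...$ unit; text outside the units (blanks) is skipped. */
    static List<LexicalUnit> parse(String stream) {
        List<LexicalUnit> units = new ArrayList<>();
        int pos = 0;
        while (pos < stream.length()) {
            int start = stream.indexOf('^', pos);
            if (start < 0) break;
            int end = stream.indexOf('$', start + 1);
            if (end < 0) break;
            // Assumption: no escaped '/' inside the unit.
            String[] parts = stream.substring(start + 1, end).split("/");
            List<String> analyses = new ArrayList<>();
            for (int i = 1; i < parts.length; i++) {
                analyses.add(parts[i]);
            }
            units.add(new LexicalUnit(parts[0], analyses));
            pos = end + 1;
        }
        return units;
    }

    /** Lemma of one analysis: everything before the first tag. */
    static String lemma(String analysis) {
        int lt = analysis.indexOf('<');
        return lt < 0 ? analysis : analysis.substring(0, lt);
    }

    /** Tags of one analysis, without the angle brackets. */
    static List<String> tags(String analysis) {
        List<String> result = new ArrayList<>();
        Matcher m = TAG.matcher(analysis);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String sample =
            "^jeg/jeg<prn><p1><mf><sg><nom>/jeg<n><nt><sg><ind>$^./.<sent><clb>$[][\n]";
        for (LexicalUnit unit : parse(sample)) {
            for (String analysis : unit.analyses) {
                System.out.println(unit.surface + " -> " + lemma(analysis) + " " + tags(analysis));
            }
        }
    }
}
```

An analyzer could then index the lemma and tags alongside the surface form, but how that maps onto Lucene tokens is left open here.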

The GSoC student Stephen Tigner is currently working on making sure the ports are all usable
as libraries; from what I understand, only minor cleanup work is left on that.

I can't say anything about the license issue, though. Other than Stephen Tigner, the most active
contributor to the port is Jacob Nordfalk.

      was (Author: unhammer):
    A little update: The Java port of lttoolbox has been complete for some time now, and the
port of apertium-tagger at least does disambiguation (training of models is not supported
yet though):

{noformat}
$ echo 'jeg' | apertium-destxt-j | lt-proc-j nb-nn.automorf.bin | apertium-tagger-j -g nb-nn.prob -f
^jeg/jeg<prn><p1><mf><sg><nom>/jeg<n><nt><sg><ind>$^./.<sent><clb>$[][
]
{noformat}

The GSoC student Stephen Tigner is currently working on making sure the ports are all usable
as libraries; from what I understand, only minor cleanup work is left on that.

I can't say anything about the license issue, though. Other than Stephen Tigner, the most active
contributor to the port is Jacob Nordfalk.
  
> Set of Java classes that allow the Lucene search engine to use morphological information developed for the Apertium open-source machine translation platform (http://www.apertium.org)
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1284
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1284
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/analyzers
>         Environment: New feature developed under GNU/Linux, but it should work on any other Java-compliant platform
>            Reporter: Felipe Sánchez Martínez
>            Assignee: Otis Gospodnetic
>         Attachments: apertium-morph.0.9.0.tgz
>
>
> Set of Java classes that allow the Lucene search engine to use morphological information developed for the Apertium open-source machine translation platform (http://www.apertium.org). Morphological information is used to index new documents and to process smarter queries in which morphological attributes can be used to specify query terms.
> The tool makes use of morphological analyzers and dictionaries developed for the open-source machine translation platform Apertium (http://apertium.org) and, optionally, the part-of-speech taggers developed for it. Currently there are morphological dictionaries available for Spanish, Catalan, Galician, Portuguese, Aranese, Romanian, French and English. In addition, new dictionaries are being developed for Esperanto, Occitan, Basque, Swedish, Danish, Welsh, Polish and Italian, among others; we hope more language pairs will be added to the Apertium machine translation platform in the near future.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

