lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-1287) Allow usage of HyphenationCompoundWordTokenFilter without dictionary
Date Tue, 18 May 2010 01:16:44 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-1287:
--------------------------------

    Attachment: LUCENE-1287.patch

I think this is a nice feature.

I see some interesting hyphenation-only results presented here:
http://lwa09.informatik.tu-darmstadt.de/pub/IR/WebHome/wir2009_leveling.pdf

I updated your patch to trunk. I would like to commit it to trunk/3x in a few days if no one
objects.

> Allow usage of HyphenationCompoundWordTokenFilter without dictionary
> --------------------------------------------------------------------
>
>                 Key: LUCENE-1287
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1287
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/analyzers
>            Reporter: Thomas Peuss
>            Assignee: Robert Muir
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-1287.patch, LUCENE-1287.patch
>
>
> We should allow using the HyphenationCompoundWordTokenFilter without a dictionary. This
> produces a lot of "nonword" tokens but might be useful sometimes.
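
To illustrate the trade-off described above, here is a minimal, self-contained sketch of
hyphenation-based decompounding with an optional dictionary. It is not the Lucene
implementation: the class name HyphenationSketch, the method decompose, and the hand-picked
hyphenation points are all hypothetical. The real filter obtains its break points from a
hyphenation grammar and has further options; this sketch assumes the points are simply
given as sorted offsets that include 0 and the word length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the Lucene API.
class HyphenationSketch {

    /**
     * Returns every substring of word that starts and ends on a hyphenation
     * point and whose length lies in [minSubwordSize, maxSubwordSize].
     * With dictionary == null all such candidates are kept (including
     * "nonwords"); otherwise only candidates found in the dictionary are kept.
     * The points array is assumed to be sorted ascending.
     */
    static List<String> decompose(String word, int[] points, Set<String> dictionary,
                                  int minSubwordSize, int maxSubwordSize) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            for (int j = i + 1; j < points.length; j++) {
                int length = points[j] - points[i];
                if (length > maxSubwordSize) {
                    break; // later end points only make the part longer
                }
                if (length < minSubwordSize) {
                    continue; // too short, try the next end point
                }
                String part = word.substring(points[i], points[j]);
                if (dictionary == null || dictionary.contains(part)) {
                    tokens.add(part);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed break points for "rind|fle|isch", chosen by hand for the example.
        int[] points = {0, 4, 7, 11};
        Set<String> dictionary = new HashSet<>(Arrays.asList("rind", "fleisch"));

        // Without a dictionary: [rind, rindfle, rindfleisch, fle, fleisch, isch]
        System.out.println(decompose("rindfleisch", points, null, 2, 15));
        // With a dictionary: [rind, fleisch]
        System.out.println(decompose("rindfleisch", points, dictionary, 2, 15));
    }
}
```

Without the dictionary, fragments such as "rindfle" and "isch" are emitted alongside the
real subwords, which is the "nonword" effect mentioned in the issue description.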

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

