lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Spelt, for better spelling correction
Date Wed, 21 Mar 2007 01:57:29 GMT
Boy, I'm looking forward to this!  I read some of the background discussion.  I think this
might fit as a Lucene contrib, but we'll be able to tell when the code makes it into JIRA.

Otis
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

----- Original Message ----
From: Martin Haye <martin.haye@gmail.com>
To: java-user@lucene.apache.org
Sent: Tuesday, March 20, 2007 6:24:43 PM
Subject: Spelt, for better spelling correction

As part of XTF, an open source publishing engine that uses Lucene, I
developed a new spelling correction engine specifically to provide "Did you
mean..." links for misspelled queries. I and a small group are preparing
this for submission as a contrib module to Lucene. And we're inviting
interested people to join the discussion about it.

The new engine is being called "Spelt" and differs from the one currently in
Lucene contrib in the following ways:

- More accurate: Much better performance on single-word queries (90% correct
in #1 slot in my tests). On general list including multi-word queries, gets
80%+ correct.
- Multi-word: Handles and corrects multi-word queries such as "harrypotter"
-> "harry potter".
- Fast: In my tests, builds the dictionary more than 30 times faster.
- Small: Dictionary size is roughly a third of that built by the existing
engine.
- Other bells and whistles...

There is already a standalone test program that people can try out, and
we're interested in feedback. If you're interested in discussing, testing,
or previewing, consider joining the Google group:
http://groups.google.com/group/spelt/

--Martin




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message