incubator-ooo-dev mailing list archives

From Yakov Reztsov <yakovr...@mail.ru>
Subject Re: Need help with Hunspell's .aff syntax
Date Thu, 10 May 2012 17:52:48 GMT

> Hello, I just joined this mailing list for the purpose of understanding
> Hunspell better. I am trying to create a spell checker for central
> kurdish/sorani and am currently looking through examples and playing with
> the .aff file.
> 
> I don't really know how mailing lists works but if anyone has answers to
> these things I'd appreciate it (follow up questions may arise).
> 
> 1.What does the TRY attribute actually do? I found the manuals cryptical in
> their explanation. I understand that it is used to determine wrong
> characters in words, I don't get how it does it though or how I should set
> it up for my needs.
> 
The TRY attribute is used only for generating suggestions. It is not applied
to correctly spelled words. The dictionary will work without this attribute.
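
To make this concrete, here is a rough Python sketch of the idea. It is not
Hunspell's actual code, and real Hunspell combines several other suggestion
strategies. The function name, the sample TRY string and the word set are all
made up for illustration. It assumes TRY characters are tried in the order
listed, replacing or inserting one character at a time, and that a candidate is
kept only if it is in the dictionary.

```python
def try_suggestions(word, try_chars, dictionary):
    """Return dictionary words that are one TRY-character edit away from `word`.

    Simplified illustration only; not Hunspell's real suggestion algorithm.
    """
    found = []
    seen = set()

    def consider(candidate):
        # Keep a candidate only if it is a known word and not a duplicate.
        if candidate in dictionary and candidate != word and candidate not in seen:
            seen.add(candidate)
            found.append(candidate)

    # Characters listed earlier in TRY are tried first, so they rank higher.
    for ch in try_chars:
        # Replace each character of the word with the TRY character.
        for i in range(len(word)):
            consider(word[:i] + ch + word[i + 1:])
    for ch in try_chars:
        # Insert the TRY character at every position, including the end.
        for i in range(len(word) + 1):
            consider(word[:i] + ch + word[i:])
    return found


# Hypothetical word list and TRY line, for illustration only.
words = {"hello", "help", "held"}
print(try_suggestions("helo", "esianrtolcdugmphbyfvkw", words))
```

So for a new language, TRY would normally list the letters of your alphabet,
most frequent first. With an empty TRY the sketch above finds nothing, but
spell checking itself is unaffected.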


> 2.Taken from manual4:
> 
> "Personal dictionaries are simple word lists. Asterisk at the first
> > character position signs prohibition. A second word separated by a slash
> > sets the affixation.
> >
> > foo
> > Foo/Simpson
> > *bar
> >
> > In this example, "foo" and "Foo" are personal words, plus Foo will be
> > recognized with affixes of Simpson (Foo’s etc.) and bar is a forbidden
> > word."
> 
> 
> 
> What does the "affixes of Simpson" mean? Is Simpson a flag/class in the
> .aff file or what? Or does it  mean "FooSimpson" will be allowed?
> 
> 3. What does this compoundrule from an en_US.aff mean and how does it make
> the rules for adding "st", "th", "nd", "rd" to numbers properly?
> 
> # ordinal numbers
> > COMPOUNDMIN 1
> > # only in compounds: 1th, 2th, 3th
> > ONLYINCOMPOUND c
> > # compound rules:
> > # 1. [0-9]*1[0-9]th (10th, 11th, 12th, 56714th, etc.)
> > # 2. [0-9]*[02-9](1st|2nd|3rd|[4-9]th) (21st, 22nd, 123rd, 1234th, etc.)
> > COMPOUNDRULE 2
> > COMPOUNDRULE n*1t
> > COMPOUNDRULE n*mp
> > WORDCHARS 0123456789
> 
> 
> 4. When I've created all the rules and a dictionary. Do I then use Hunspell
> to generate better .dic/.aff files? If so, how are they better? (words with
> prefixes are removed?)
> 

No. Once the rules and the dictionary are written, the task is complete.
You can, however, generate the affix and dictionary files from a word list
with the affixcompress script:
http://hunspell.cvs.sourceforge.net/viewvc/hunspell/hunspell/src/tools/affixcompress?revision=1.1.1.1
Files written by hand will work better.


> What else do you need the hunspell source and executables for? Is it for
> the testing features or is there something I've missed that is awesome
> about having the Hunspell source?
> 
> 

For testing only.



 --
Yakov