incubator-ooo-dev mailing list archives

From "Sahand.T" <sasht...@gmail.com>
Subject Re: Need help with Hunspell's .aff syntax
Date Mon, 14 May 2012 19:52:04 GMT
About question #3: I figured it out, in case anyone has any use for it:

The star (*) works as in a regular expression: it matches zero or more
occurrences of the preceding item. 9* matches <blank>, 9, 99, 999, ...,
99999, etc. In a COMPOUNDRULE the preceding item is a flag, so n* matches
any run of words flagged /n.
The wordlist is hunspell/tests/compoundrule4.dic, which contains cardinal
numbers (0-9) and ordinal numbers (0th, 1st, 1th, ..., 8th, 9th). Each
number has one or more flags according to its type and suffix. Ordinals
ending in "th" are flagged /t; the other flags are /n, /p, /m and /c.
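
To illustrate the star described above, here is a minimal Python sketch. It
uses an ordinary regular expression on characters rather than Hunspell flags,
and the helper name matches_star is made up for this example.

```python
import re

# In a regular expression, "9*" means zero or more repetitions of "9".
STAR = re.compile(r"9*")


def matches_star(text: str) -> bool:
    """Return True if the whole string consists of zero or more 9s."""
    return STAR.fullmatch(text) is not None


if __name__ == "__main__":
    # The empty string matches too, because "zero occurrences" is allowed.
    for sample in ["", "9", "999", "98"]:
        print(repr(sample), matches_star(sample))
```

In a COMPOUNDRULE the same idea applies to whole dictionary words carrying a
flag instead of to single characters.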

Rule 1 (n*1t): 0-9 in any combination (n*) + 1 + an ordinal number flagged
/t, i.e. one with the "th" suffix. This covers numbers whose tens digit is 1
(10th-19th, 2012th).
Rule 2 (n*mp): 0-9 in any combination (n*) + 0 or 2-9 (m, matching the
[02-9] in the .aff comment) + an ordinal number flagged /p (1st, 2nd, 3rd,
5th, 8th etc). This covers numbers from 21st upward, except those that
rule #1 handles.
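
The two rules can be tried out with a small Python sketch. It does not call
Hunspell; it only uses the two regular expressions given in the comments of
the quoted .aff file, and the function name is_compound_ordinal is
hypothetical. Single-digit ordinals such as 1st are presumably plain
dictionary entries and are not covered by these patterns.

```python
import re

# Regular expressions copied from the comments in the quoted .aff file.
RULE_1 = re.compile(r"[0-9]*1[0-9]th")                     # COMPOUNDRULE n*1t
RULE_2 = re.compile(r"[0-9]*[02-9](1st|2nd|3rd|[4-9]th)")  # COMPOUNDRULE n*mp


def is_compound_ordinal(word: str) -> bool:
    """Return True if the whole word matches either of the two rules."""
    return any(rule.fullmatch(word) for rule in (RULE_1, RULE_2))


if __name__ == "__main__":
    samples = ["10th", "11th", "56714th", "21st", "123rd",
               "11st", "21th", "12nd"]
    for word in samples:
        print(word, is_compound_ordinal(word))
```

The first five samples are taken from the examples in the .aff comments; the
last three are wrong forms that neither pattern should accept.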

Good luck!


2012/5/10 Sahand.T <sashther@gmail.com>

> Thank you Yakov.
>
> 1. About the TRY attribute: it says that the letters should be ordered from
> most used to least used. This means that "TRY aeis" tries to replace the
> character "a" in words first, and then "e", "i" etc. So if a person types
> 'willang', it will replace "a" before "i" and possibly find "willing",
> correct?
>
> Do you happen to know the answers to question #2 and #3 too?
>
> /Sahand
>
>
> 2012/5/10 Yakov Reztsov <yakovr_st@mail.ru>
>
>>
>> > Hello, I just joined this mailing list for the purpose of understanding
>> > Hunspell better. I am trying to create a spell checker for central
>> > kurdish/sorani and am currently looking through examples and playing
>> > with the .aff file.
>> >
>> > I don't really know how mailing lists work, but if anyone has answers to
>> > these things I'd appreciate it (follow-up questions may arise).
>> >
>> > 1. What does the TRY attribute actually do? I found the manuals cryptic
>> > in their explanation. I understand that it is used to determine wrong
>> > characters in words, but I don't get how it does it or how I should set
>> > it up for my needs.
>> >
>> The TRY attribute is used to generate suggestions. It is not applied to
>> correctly spelled words.
>> The dictionary will work without this attribute.
>>
>>
>> > 2. Taken from manual4:
>> >
>> > > "Personal dictionaries are simple word lists. Asterisk at the first
>> > > character position signs prohibition. A second word separated by a
>> > > slash sets the affixation.
>> > >
>> > > foo
>> > > Foo/Simpson
>> > > *bar
>> > >
>> > > In this example, "foo" and "Foo" are personal words, plus Foo will be
>> > > recognized with affixes of Simpson (Foo’s etc.) and bar is a forbidden
>> > > word."
>> >
>> >
>> >
>> > What does the "affixes of Simpson" mean? Is Simpson a flag/class in the
>> > .aff file or what? Or does it mean "FooSimpson" will be allowed?
>> >
>> > 3. What does this compoundrule from an en_US.aff mean and how does it
>> > make the rules for adding "st", "th", "nd", "rd" to numbers properly?
>> >
>> > > # ordinal numbers
>> > > COMPOUNDMIN 1
>> > > # only in compounds: 1th, 2th, 3th
>> > > ONLYINCOMPOUND c
>> > > # compound rules:
>> > > # 1. [0-9]*1[0-9]th (10th, 11th, 12th, 56714th, etc.)
>> > > # 2. [0-9]*[02-9](1st|2nd|3rd|[4-9]th) (21st, 22nd, 123rd, 1234th, etc.)
>> > > COMPOUNDRULE 2
>> > > COMPOUNDRULE n*1t
>> > > COMPOUNDRULE n*mp
>> > > WORDCHARS 0123456789
>> >
>> >
>> > 4. When I've created all the rules and a dictionary, do I then use
>> > Hunspell to generate better .dic/.aff files? If so, how are they better?
>> > (Words with prefixes are removed?)
>> >
>>
>> No. At that point the task is complete.
>> But you can generate the affix and dictionary files from a list of words
>> with a script:
>>
>> http://hunspell.cvs.sourceforge.net/viewvc/hunspell/hunspell/src/tools/affixcompress?revision=1.1.1.1
>> Files written by hand will work better.
>>
>>
>>
>> > What else do you need the hunspell source and executables for? Is it for
>> > the testing features or is there something I've missed that is awesome
>> > about having the Hunspell source?
>> >
>> >
>>
>> For testing only.
>>
>>
>>
>>  --
>> Yakov
>
>
>
