From: "Sahand.T"
Date: Thu, 10 May 2012 02:56:01 +0200
Subject: Need help with Hunspell's .aff syntax
To: ooo-dev@incubator.apache.org

Hello,

I just joined this mailing list to understand Hunspell better. I am trying to create a spell checker for Central Kurdish (Sorani) and am currently looking through examples and experimenting with the .aff file. I don't really know how mailing lists work, but if anyone has answers to the questions below I'd appreciate it (follow-up questions may arise).

1. What does the TRY attribute actually do? I found the manual's explanation cryptic. I understand that it is used to find wrong characters in words, but I don't see how it does that or how I should set it up for my needs.

2. Taken from manual4:

> "Personal dictionaries are simple word lists. Asterisk at the first
> character position signs prohibition. A second word separated by a slash
> sets the affixation.
>
> foo
> Foo/Simpson
> *bar
>
> In this example, "foo" and "Foo" are personal words, plus Foo will be
> recognized with affixes of Simpson (Foo's etc.) and bar is a forbidden
> word."

What does "affixes of Simpson" mean? Is Simpson a flag/class in the .aff file? Or does it mean that "FooSimpson" will be allowed?

3. What does this COMPOUNDRULE block from an en_US.aff mean, and how does it produce the correct "st", "nd", "rd" and "th" suffixes on numbers?

> # ordinal numbers
> COMPOUNDMIN 1
> # only in compounds: 1th, 2th, 3th
> ONLYINCOMPOUND c
> # compound rules:
> # 1. [0-9]*1[0-9]th (10th, 11th, 12th, 56714th, etc.)
> # 2. [0-9]*[02-9](1st|2nd|3rd|[4-9]th) (21st, 22nd, 123rd, 1234th, etc.)
> COMPOUNDRULE 2
> COMPOUNDRULE n*1t
> COMPOUNDRULE n*mp
> WORDCHARS 0123456789

4. Once I've created all the rules and a dictionary, do I then use Hunspell to generate better .dic/.aff files? If so, how are they better? (Are words with prefixes removed?) What else do you need the Hunspell source and executables for? Is it for the testing features, or is there something awesome about having the Hunspell source that I've missed?

Thanks for reading! It's bedtime for me now, so I won't be answering until tomorrow.

Nitey!

/Sahand
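
For reference, the two patterns quoted in the comments under question 3 can be tried out as ordinary regular expressions. The following is only a minimal Python sketch of what those two comment lines say. It does not show how Hunspell itself evaluates COMPOUNDRULE, and the names RULE_1, RULE_2 and matches_ordinal_comment are made up for illustration.

```python
import re

# Pattern 1 from the .aff comment: any digits, then a "1" in the tens
# place, then any digit, then "th" (10th, 11th, 12th, 56714th, ...).
RULE_1 = re.compile(r"[0-9]*1[0-9]th")

# Pattern 2 from the .aff comment: any digits, then a tens digit that is
# not "1", then the last digit with its matching suffix
# (21st, 22nd, 123rd, 1234th, ...).
RULE_2 = re.compile(r"[0-9]*[02-9](1st|2nd|3rd|[4-9]th)")


def matches_ordinal_comment(word: str) -> bool:
    """Return True if the whole word matches either commented pattern."""
    return bool(RULE_1.fullmatch(word) or RULE_2.fullmatch(word))


if __name__ == "__main__":
    for candidate in ["10th", "11th", "21st", "123rd", "21th", "11st", "12nd"]:
        print(candidate, matches_ordinal_comment(candidate))
```

As written, both patterns need at least two digits, so this sketch says nothing about single-digit ordinals such as "1st".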