From: "Mark Miller (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-1628) Persian Analyzer
Date: Wed, 10 Jun 2009 19:13:07 -0700 (PDT)
Message-ID: <1363552283.1244686387724.JavaMail.jira@brutus>
In-Reply-To: <459220706.1241370210396.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/LUCENE-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718275#action_12718275 ]

Mark Miller commented on LUCENE-1628:
-------------------------------------

Okay, I see that the stopword list for Arabic was committed by Grant with the BSD license. I'll take that as an "it's okay" unless anyone speaks up.

Thanks for all these great Analyzers, Robert.

> Persian Analyzer
> ----------------
>
> Key: LUCENE-1628
> URL: https://issues.apache.org/jira/browse/LUCENE-1628
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/analyzers
> Reporter: Robert Muir
> Assignee: Mark Miller
> Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1628.patch, LUCENE-1628.patch
>
>
> A simple Persian analyzer.
> I measured TREC scores with the benchmark package below against http://ece.ut.ac.ir/DBRG/Hamshahri/ :
>
> SimpleAnalyzer:
> SUMMARY
> Search Seconds: 0.012
> DocName Seconds: 0.020
> Num Points: 981.015
> Num Good Points: 33.738
> Max Good Points: 36.185
> Average Precision: 0.374
> MRR: 0.667
> Recall: 0.905
> Precision At 1: 0.585
> Precision At 2: 0.531
> Precision At 3: 0.513
> Precision At 4: 0.496
> Precision At 5: 0.486
> Precision At 6: 0.487
> Precision At 7: 0.479
> Precision At 8: 0.465
> Precision At 9: 0.458
> Precision At 10: 0.460
> Precision At 11: 0.453
> Precision At 12: 0.453
> Precision At 13: 0.445
> Precision At 14: 0.438
> Precision At 15: 0.438
> Precision At 16: 0.438
> Precision At 17: 0.429
> Precision At 18: 0.429
> Precision At 19: 0.419
> Precision At 20: 0.415
>
> PersianAnalyzer:
> SUMMARY
> Search Seconds: 0.004
> DocName Seconds: 0.011
> Num Points: 987.692
> Num Good Points: 36.123
> Max Good Points: 36.185
> Average Precision: 0.481
> MRR: 0.833
> Recall: 0.998
> Precision At 1: 0.754
> Precision At 2: 0.715
> Precision At 3: 0.646
> Precision At 4: 0.646
> Precision At 5: 0.631
> Precision At 6: 0.621
> Precision At 7: 0.593
> Precision At 8: 0.577
> Precision At 9: 0.573
> Precision At 10: 0.566
> Precision At 11: 0.572
> Precision At 12: 0.562
> Precision At 13: 0.554
> Precision At 14: 0.549
> Precision At 15: 0.542
> Precision At 16: 0.538
> Precision At 17: 0.533
> Precision At 18: 0.527
> Precision At 19: 0.525
> Precision At 20: 0.518