lucene-dev mailing list archives

From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1628) Persian Analyzer
Date Tue, 14 Jul 2009 23:21:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731198#action_12731198 ]

Mark Miller commented on LUCENE-1628:
-------------------------------------

bq. mark, i'm sorry you had to reformat it. 

No worries - I certainly didn't have to. I just ran it because I re-added it to Eclipse
today. It certainly wasn't necessary, and perhaps there is more than one of these files floating
around out there with a slight difference?

No big deal at all, just wanted to mention the change - I wouldn't have even made the patch
other than to remove the imports, and they are not a big deal either. There are a bunch in
Lucene right now. And there is some crazy, wacky formatting as well. It's easy to be anal
about the small stuff when someone else has done all the work on the big stuff ;)

> Persian Analyzer
> ----------------
>
>                 Key: LUCENE-1628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1628
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1628.patch, LUCENE-1628.patch, LUCENE-1628.patch, LUCENE-1628.txt
>
>
> A simple persian analyzer.
> i measured trec scores with the benchmark package below against http://ece.ut.ac.ir/DBRG/Hamshahri/ :
> SimpleAnalyzer:
> SUMMARY
>   Search Seconds:         0.012
>   DocName Seconds:        0.020
>   Num Points:           981.015
>   Num Good Points:       33.738
>   Max Good Points:       36.185
>   Average Precision:      0.374
>   MRR:                    0.667
>   Recall:                 0.905
>   Precision At 1:         0.585
>   Precision At 2:         0.531
>   Precision At 3:         0.513
>   Precision At 4:         0.496
>   Precision At 5:         0.486
>   Precision At 6:         0.487
>   Precision At 7:         0.479
>   Precision At 8:         0.465
>   Precision At 9:         0.458
>   Precision At 10:        0.460
>   Precision At 11:        0.453
>   Precision At 12:        0.453
>   Precision At 13:        0.445
>   Precision At 14:        0.438
>   Precision At 15:        0.438
>   Precision At 16:        0.438
>   Precision At 17:        0.429
>   Precision At 18:        0.429
>   Precision At 19:        0.419
>   Precision At 20:        0.415
> PersianAnalyzer:
> SUMMARY
>   Search Seconds:         0.004
>   DocName Seconds:        0.011
>   Num Points:           987.692
>   Num Good Points:       36.123
>   Max Good Points:       36.185
>   Average Precision:      0.481
>   MRR:                    0.833
>   Recall:                 0.998
>   Precision At 1:         0.754
>   Precision At 2:         0.715
>   Precision At 3:         0.646
>   Precision At 4:         0.646
>   Precision At 5:         0.631
>   Precision At 6:         0.621
>   Precision At 7:         0.593
>   Precision At 8:         0.577
>   Precision At 9:         0.573
>   Precision At 10:        0.566
>   Precision At 11:        0.572
>   Precision At 12:        0.562
>   Precision At 13:        0.554
>   Precision At 14:        0.549
>   Precision At 15:        0.542
>   Precision At 16:        0.538
>   Precision At 17:        0.533
>   Precision At 18:        0.527
>   Precision At 19:        0.525
>   Precision At 20:        0.518
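
For readers unfamiliar with the figures above, here is a minimal sketch of how
"Precision At k" and the per-query reciprocal rank behind MRR are commonly
defined. It uses the textbook definitions and is not taken from the benchmark
package, which may compute its numbers differently. The class name
RankingMetrics, its methods, and the sample relevance list are all
hypothetical, made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: textbook ranking metrics
// for a single query, given relevance flags in ranked order.
final class RankingMetrics {

    // Precision at k: fraction of the top k results that are relevant.
    // Missing results (list shorter than k) count as non-relevant.
    static double precisionAt(List<Boolean> relevant, int k) {
        int hits = 0;
        int limit = Math.min(k, relevant.size());
        for (int i = 0; i < limit; i++) {
            if (relevant.get(i)) {
                hits++;
            }
        }
        return (double) hits / k;
    }

    // Reciprocal rank: 1 / (1-based rank of the first relevant result),
    // or 0 if nothing relevant was returned. MRR is the mean over queries.
    static double reciprocalRank(List<Boolean> relevant) {
        for (int i = 0; i < relevant.size(); i++) {
            if (relevant.get(i)) {
                return 1.0 / (i + 1);
            }
        }
        return 0.0;
    }

    public static void main(String[] args) {
        // Made-up example: results 2, 3 and 5 are relevant.
        List<Boolean> ranked = Arrays.asList(false, true, true, false, true);
        System.out.println("P@1 = " + precisionAt(ranked, 1));
        System.out.println("P@2 = " + precisionAt(ranked, 2));
        System.out.println("P@5 = " + precisionAt(ranked, 5));
        System.out.println("RR  = " + reciprocalRank(ranked));
    }
}
```

Under these definitions, a higher "Precision At 1" and MRR, as reported for
PersianAnalyzer, mean that a relevant document tends to appear closer to the
top of the result list.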

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

