lucene-dev mailing list archives

From "Basem Narmok (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1966) Arabic Analyzer: Stopwords list needs enhancement
Date Sun, 11 Oct 2009 22:22:31 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764515#action_12764515 ]

Basem Narmok commented on LUCENE-1966:
--------------------------------------

Seems good.

BTW, with FAST ESP we never used stopwords: a keyword with a high number of hits has low
value and low importance, so hits on stopwords get low relevancy and never make it into the
top results. Also, removing stopwords affects phrase search, so most search engines avoid
removing them. In the end, though, it depends on the client's application and what she
really wants, as enterprise search can have very specific needs that differ from those of
Internet search.
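
To illustrate the "high number of hits = low value" point, here is a minimal sketch of a
smoothed inverse document frequency (IDF) weight of the kind used in TF-IDF scoring. It is
not FAST ESP's actual ranking formula, which is not described here; the class name
IdfSketch, the collection size, and the document frequencies are all made-up values chosen
for illustration only.

```java
public class IdfSketch {

    // Smoothed IDF: 1 + ln(numDocs / (docFreq + 1)).
    // The more documents a term appears in, the smaller its weight.
    public static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        long numDocs = 1_000_000;        // hypothetical collection size
        long stopwordDocFreq = 950_000;  // hypothetical: a stopword found in almost every document
        long rareTermDocFreq = 120;      // hypothetical: a content-bearing term

        // The stopword's weight stays close to the minimum, while the rare term's is much larger,
        // so matches on the stopword contribute little to a document's score.
        System.out.printf("stopword idf  = %.3f%n", idf(stopwordDocFreq, numDocs));
        System.out.printf("rare term idf = %.3f%n", idf(rareTermDocFreq, numDocs));
    }
}
```

Under a weighting like this, a stopword can stay in the index (so phrase queries still
match it) while contributing very little to the ranking of the top results.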

Anyway, I am still testing the Arabic Analyzer, and I will provide you with more comments
soon. But the stopwords are good for now :)

> Arabic Analyzer: Stopwords list needs enhancement
> -------------------------------------------------
>
>                 Key: LUCENE-1966
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1966
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>    Affects Versions: 2.9
>            Reporter: Basem Narmok
>            Assignee: Robert Muir
>            Priority: Trivial
>             Fix For: 3.0
>
>         Attachments: arabic-stopwords-comments.txt, LUCENE-1966.patch, LUCENE-1966.patch
>
>
> The provided Arabic stopwords list needs some enhancements (e.g. it contains a lot of
> words that are not stopwords, and it needs some cleanup). A patch will be provided with
> this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

