lucene-dev mailing list archives

From "Benjamin Henriet (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-835) An IndexReader with run-time support for synonyms
Date Mon, 26 Mar 2007 15:39:34 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484139 ]

Benjamin Henriet commented on LUCENE-835:
-----------------------------------------

Hi Mark,
Thank you for your work. You said: "Query-parse-time injection is awkward because special
support is required in the parser/query logic to recognise and cater for the tokens that appear
in the same position." Is there an implementation of this "special support"? I have a similar
problem with Dutch word decomposition: at query time I would like to decompound words like
"hulparbeider" into "hulparbeider" OR "hulp" OR "arbeider", but the parsed query contains only
one word group: "hulparbeider hulp arbeider".
Can you give me any tips?
Thank you
Benjamin
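
One way to picture the "special support" asked about here is a small, library-free Java sketch. It is not the Lucene QueryParser and not code from the attachments. It assumes the analyzer emits the decompounded parts with a position increment of 0 (that is, at the same position as the original word), and the names PositionAwareQuerySketch, Tok and toQueryString are hypothetical. Tokens that share a position are grouped into an OR clause instead of being flattened into one word group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: no Lucene classes are used here.
class PositionAwareQuerySketch {

    /** A token's text plus its position increment (0 = same position as the previous token). */
    static final class Tok {
        final String text;
        final int positionIncrement;

        Tok(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Builds a query string in which tokens sharing a position become alternatives (OR). */
    static String toQueryString(List<Tok> tokens) {
        List<List<String>> groups = new ArrayList<>();
        for (Tok t : tokens) {
            if (t.positionIncrement == 0 && !groups.isEmpty()) {
                // Same position as the previous token: add it as an alternative.
                groups.get(groups.size() - 1).add(t.text);
            } else {
                // New position: start a new group.
                List<String> group = new ArrayList<>();
                group.add(t.text);
                groups.add(group);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (List<String> group : groups) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (group.size() == 1) {
                sb.append(group.get(0));
            } else {
                sb.append('(').append(String.join(" OR ", group)).append(')');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Tok> tokens = Arrays.asList(
                new Tok("hulparbeider", 1),
                new Tok("hulp", 0),      // assumed to be emitted at the same position
                new Tok("arbeider", 0)); // assumed to be emitted at the same position
        System.out.println(toQueryString(tokens)); // (hulparbeider OR hulp OR arbeider)
    }
}
```

In a real parser the same grouping would presumably produce a boolean query with optional clauses rather than a string; the string form is used here only to keep the sketch self-contained.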

> An IndexReader with run-time support for synonyms
> -------------------------------------------------
>
>                 Key: LUCENE-835
>                 URL: https://issues.apache.org/jira/browse/LUCENE-835
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.1
>            Reporter: Mark Harwood
>         Assigned To: Mark Harwood
>         Attachments: Synonym.java, SynonymIndexReader.java, SynonymSet.java, TestSynonymIndexReader.java
>
>
> These classes provide support for enabling the use of synonyms for terms in an existing
> index.
> While Analyzers can be used at Query-parse time or Index-time to inject synonyms these
> are not always satisfactory means of providing support for synonyms:
> * Index-time injection of synonyms is less flexible because changing the lists of synonyms
>   requires an index rebuild.
> * Query-parse-time injection is awkward because special support is required in the
>   parser/query logic to recognise and cater for the tokens that appear in the same position.
>   Additionally, any statistical analysis of the index content via TermEnum/TermDocs etc
>   does not consider the synonyms unless specific code is added.
> What is perhaps more useful is a transparent wrapper for the IndexReader that provides
> a synonym-ized view of the index without requiring specialised support in the calling code.
> All of the TermEnum/TermDocs interfaces remain the same but behind the scenes synonyms are
> being considered/applied silently.
> The classes supplied here provide this "virtual" view of the index and all queries or
> other code that examines this index using the special reader benefit from this view without
> requiring specialized code. A JUnit test illustrates this code in action.
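
The "transparent wrapper" idea described in the issue can be illustrated with a toy model in plain Java. This is not the attached SynonymIndexReader and it uses no Lucene API: TermLookup, MapIndex, SynonymView and demoView are hypothetical names, the index is an in-memory map, and synonyms are applied in one direction only. The point is that calling code uses the same interface for both the raw index and the synonym-ized view, and statistics such as document frequency pick up the synonyms without any special handling.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only: a toy inverted index, not Lucene's IndexReader.
class SynonymViewSketch {

    /** The one interface callers see, for the raw index and the wrapped view alike. */
    interface TermLookup {
        SortedSet<Integer> docs(String term);
    }

    /** A minimal in-memory index: term -> sorted document ids. */
    static final class MapIndex implements TermLookup {
        private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

        void add(int docId, String... terms) {
            for (String term : terms) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }

        @Override
        public SortedSet<Integer> docs(String term) {
            SortedSet<Integer> found = postings.get(term);
            return found == null ? new TreeSet<>() : new TreeSet<>(found);
        }
    }

    /** Wraps another TermLookup and silently merges in the postings of each synonym. */
    static final class SynonymView implements TermLookup {
        private final TermLookup delegate;
        private final Map<String, Set<String>> synonyms;

        SynonymView(TermLookup delegate, Map<String, Set<String>> synonyms) {
            this.delegate = delegate;
            this.synonyms = synonyms;
        }

        @Override
        public SortedSet<Integer> docs(String term) {
            SortedSet<Integer> merged = new TreeSet<>(delegate.docs(term));
            for (String synonym : synonyms.getOrDefault(term, Collections.emptySet())) {
                merged.addAll(delegate.docs(synonym));
            }
            return merged;
        }
    }

    /** Calling code: written once, unaware of whether synonyms are applied. */
    static int docFreq(TermLookup index, String term) {
        return index.docs(term).size();
    }

    static MapIndex demoIndex() {
        MapIndex index = new MapIndex();
        index.add(0, "car");
        index.add(1, "automobile");
        index.add(2, "car", "truck");
        return index;
    }

    static SynonymView demoView() {
        Map<String, Set<String>> synonyms = new HashMap<>();
        synonyms.put("car", Collections.singleton("automobile"));
        return new SynonymView(demoIndex(), synonyms);
    }

    public static void main(String[] args) {
        System.out.println(docFreq(demoIndex(), "car")); // 2
        System.out.println(docFreq(demoView(), "car"));  // 3
    }
}
```

Changing the synonym map in this model needs no change to the underlying index, which mirrors the flexibility argument made above against index-time injection.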

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

