lucene-java-user mailing list archives

From "Angel, Eric" <ean...@business.com>
Subject RE: ShingleMatrixFilter for synonyms
Date Wed, 14 Jan 2009 01:32:30 GMT
The unit tests don't really show how I could use it for synonyms at
index time. Does anyone have sample code?  Is it even possible?
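
To make the question concrete, here is a minimal sketch of what index-time synonym injection usually means: each original term advances the position by one, and its synonyms are stacked at the same position with a position increment of 0. This is plain Java with no Lucene dependency. The class and method names (SynonymInjectionSketch, Tok, inject, render) are hypothetical and are not part of ShingleMatrixFilter or any Lucene API. Multi-word synonyms are kept as single tokens here, which is an assumption of the sketch, not a recommendation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SynonymInjectionSketch {

    /** Hypothetical stand-in for an analyzer token: a term plus its position increment. */
    static final class Tok {
        final String term;
        final int posInc;

        Tok(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /** Emits each input term with increment 1, followed by its synonyms with increment 0. */
    static List<Tok> inject(List<String> terms, Map<String, String[]> synonyms) {
        List<Tok> out = new ArrayList<Tok>();
        for (String term : terms) {
            out.add(new Tok(term, 1));
            String[] syns = synonyms.get(term);
            if (syns == null) {
                continue; // no entry for this term
            }
            for (String syn : syns) {
                // Same position as the original term; a multi-word synonym
                // such as "drug store" stays one token in this sketch.
                out.add(new Tok(syn, 0));
            }
        }
        return out;
    }

    /** Renders tokens as "term/+increment", comma separated. */
    static String render(List<Tok> toks) {
        StringBuilder sb = new StringBuilder();
        for (Tok t : toks) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(t.term).append("/+").append(t.posInc);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String[]> synonyms = new HashMap<String, String[]>();
        synonyms.put("fast", new String[] {"quick", "speedy"});
        synonyms.put("pharmacy", new String[] {"drug store"});

        List<String> terms = Arrays.asList("fast", "pharmacy", "open");
        System.out.println(render(inject(terms, synonyms)));
    }
}
```

A single-token "drug store" only matches a query that produces the same single token, which is presumably where shingles at query time would come in.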

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com] 
Sent: Tuesday, January 13, 2009 3:06 PM
To: java-user@lucene.apache.org
Subject: Re: ShingleMatrixFilter for synonyms

Eric,

Unit tests should help you see how this can be used:

./contrib/analyzers/src/java/org/apache/lucene/analysis/shingle/ShingleFilter.java
./contrib/analyzers/src/java/org/apache/lucene/analysis/shingle/ShingleAnalyzerWrapper.java
./contrib/analyzers/src/java/org/apache/lucene/analysis/shingle/ShingleMatrixFilter.java
./contrib/analyzers/src/test/org/apache/lucene/analysis/shingle/ShingleAnalyzerWrapperTest.java
./contrib/analyzers/src/test/org/apache/lucene/analysis/shingle/TestShingleMatrixFilter.java
./contrib/analyzers/src/test/org/apache/lucene/analysis/shingle/ShingleFilterTest.java

As for multi-word tokens, you just have to make sure they don't get
injected before something that would remove any portion of them.
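
The ordering point can be pictured with a small plain-Java sketch. It does not use any Lucene classes. The stop-word list, the "broken" to "out of order" mapping, and all names (FilterOrderSketch, removeStopWords, injectSynonym) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FilterOrderSketch {

    // Hypothetical stop-word list, for illustration only.
    static final Set<String> STOP_WORDS = new HashSet<String>(Arrays.asList("of", "the"));

    /** Drops every term found in STOP_WORDS. */
    static List<String> removeStopWords(List<String> terms) {
        List<String> out = new ArrayList<String>();
        for (String term : terms) {
            if (!STOP_WORDS.contains(term)) {
                out.add(term);
            }
        }
        return out;
    }

    /** Appends the words of a multi-word synonym right after the term that triggers it. */
    static List<String> injectSynonym(List<String> terms, String trigger, String synonym) {
        List<String> out = new ArrayList<String>();
        for (String term : terms) {
            out.add(term);
            if (term.equals(trigger)) {
                out.addAll(Arrays.asList(synonym.split(" ")));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("the", "broken", "printer");

        // Injecting first lets the stop filter remove "of" from "out of order".
        List<String> injectedFirst =
            removeStopWords(injectSynonym(input, "broken", "out of order"));

        // Injecting after the stop filter keeps the synonym whole.
        List<String> injectedLast =
            injectSynonym(removeStopWords(input), "broken", "out of order");

        System.out.println("inject, then stop filter: " + injectedFirst);
        System.out.println("stop filter, then inject: " + injectedLast);
    }
}
```

In the first ordering the synonym comes out as "out order", so a phrase query for "out of order" could no longer match it as written.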

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: "Angel, Eric" <eangel@business.com>
> To: java-user@lucene.apache.org
> Sent: Tuesday, January 13, 2009 2:39:11 PM
> Subject: ShingleMatrixFilter for synonyms
> 
> Does anyone have an example using this?
> 
> 
> 
> I have a SynonymEngine that returns an array of strings, some of
> which may be multiple words.  How can I incorporate this with my
> SynonymEngine at index time?
> 
> 
> 
> Also, the javadoc for the ShingleMatrixFilter class says:
> 
>             Without a spacer character it can be used to handle
> composition and decomposition of words such as searching for "multi
> dimensional" instead of "multidimensional".
> 
> 
> 
> Does anyone have a working example of this?
> 
> 
> 
> 
> 
> Here's my synonym engine (taken from the book Lucene in Action):
> 
> 
> 
> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
>
> public interface SynonymEngine {
>       // Returns the synonyms for a word, or null if none are known.
>       public String[] getSynonyms(String word) throws IOException;
> }
>
> public class DexSynonymEngine implements SynonymEngine {
>
>       // Maps a term to its synonyms; some synonyms are multi-word.
>       private static final Map<String, String[]> map =
>             new HashMap<String, String[]>();
>
>       static {
>             // numbers
>             map.put("1" , new String[] {"one"});
>             map.put("2" , new String[] {"two"});
>             map.put("3" , new String[] {"three"});
>             map.put("4" , new String[] {"four"});
>             map.put("5" , new String[] {"five"});
>             map.put("6" , new String[] {"six", "seis"});
>             map.put("7" , new String[] {"seven"});
>             map.put("8" , new String[] {"eight"});
>             map.put("9" , new String[] {"nine"});
>             map.put("10" , new String[] {"ten"});
>             map.put("11" , new String[] {"eleven"});
>             map.put("12" , new String[] {"twelve"});
>             map.put("13" , new String[] {"thirteen"});
>             map.put("14" , new String[] {"fourteen"});
>             map.put("15" , new String[] {"fifteen"});
>             map.put("16" , new String[] {"sixteen"});
>             map.put("17" , new String[] {"seventeen"});
>             map.put("18" , new String[] {"eighteen"});
>             map.put("19" , new String[] {"nineteen"});
>             map.put("20" , new String[] {"twenty"});
>             map.put("21" , new String[] {"twenty one"});
>             // words
>             map.put("pharmacy" , new String[] {"drug store"});
>             map.put("hospital" , new String[] {"medical center"});
>             map.put("fast", new String[]{"quick", "speedy"});
>             map.put("search", new String[]{"explore", "hunt", "hunting",
>                   "look"});
>             map.put("sound", new String[]{"audio"});
>             map.put("restaurant", new String[]{"eatery"});
>       }
>
>       public String[] getSynonyms(String word) throws IOException {
>             return map.get(word);
>       }
> }
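
The javadoc remark quoted above, about leaving out the spacer character, can be pictured with a plain-Java sketch of two-word shingles. This is not the ShingleMatrixFilter API. The class ShingleSpacerSketch and its bigrams method are hypothetical and only show what joining adjacent terms with and without a spacer produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShingleSpacerSketch {

    /** Joins each pair of adjacent terms with the given spacer (may be empty). */
    static List<String> bigrams(List<String> terms, String spacer) {
        List<String> out = new ArrayList<String>();
        for (int i = 0; i + 1 < terms.size(); i++) {
            out.add(terms.get(i) + spacer + terms.get(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("multi", "dimensional", "search");

        // With a spacer, the shingles stay distinct from any single word.
        System.out.println(bigrams(terms, "_"));

        // Without a spacer, "multi" + "dimensional" composes to "multidimensional",
        // so text written as two words can match the one-word form.
        System.out.println(bigrams(terms, ""));
    }
}
```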


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
