lucene-java-user mailing list archives

From David Pilato <da...@pilato.fr>
Subject Re: Configuration for edge ngram typeahead
Date Fri, 04 Jan 2013 22:01:54 GMT
Did you define mappings for your docs and fields to use that analyzer?
See: http://www.elasticsearch.org/guide/reference/api/admin-indices-put-mapping.html
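One way to check this is to ask the index which mappings it actually holds and what tokens the analyzer emits for a sample string. The following is only a sketch under assumptions: the node address `http://localhost:9200` and the index name `myindex` are hypothetical placeholders, and the JSON-body form of `_analyze` is the one used by recent Elasticsearch releases (older releases may expect `analyzer` and `text` as query-string parameters instead).

```python
import json
import urllib.request

# Assumptions: node address and index name are hypothetical placeholders.
BASE_URL = "http://localhost:9200"
INDEX = "myindex"


def mapping_request(base_url=BASE_URL, index=INDEX):
    """Build a GET request for the mappings currently stored on the index."""
    return urllib.request.Request(f"{base_url}/{index}/_mapping", method="GET")


def analyze_request(text, analyzer="typeahead_analyzer",
                    base_url=BASE_URL, index=INDEX):
    """Build a request asking the index to run `text` through `analyzer`."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request):
    """Send the request and decode the JSON reply (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(mapping_request())` should show whether the field lists the custom analyzer, and `fetch_json(analyze_request("lucene"))` should list the tokens it produces. If the mapping reply does not mention `typeahead_analyzer`, the field is probably still using the default analyzer.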

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 4 Jan 2013, at 22:30, "C. Benson Manica" <cbmanica@gmail.com> wrote:

I have been Googling for an hour, without success, for how to configure
Lucene (Elasticsearch, actually, but presumably the same applies) to index
edge ngrams for typeahead.  I don't really know how filters, analyzers, and
tokenizers work together, and the documentation isn't helpful on that count
either, but I managed to cobble together the following configuration that I
thought would work.  It doesn't, though: when I index documents into the
collection with these settings, they still match only whole words instead of
ngrams.  What am I missing?  *Should* this work?  How do I debug this to see
whether these settings are actually being applied to the documents?

{
   "settings":{
       "index":{
           "analysis":{
               "analyzer":{
                   "typeahead_analyzer":{
                       "type":"custom",
                       "tokenizer":"edgeNGram",
                       "filter":["typeahead_ngram"]
                   }
               },
               "filter":{
                   "typeahead_ngram":{
                       "type":"edgeNGram",
                       "min_gram":1,
                       "max_gram":8,
                       "side":"front"
                   }
               }
           }
       }
   },
   "mappings": {
       "name": {
           "properties": {
               "name": {
                   "type": "string",
                   "analyzer": "typeahead_analyzer"
               }
           }
       }
   }
}
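
For reference, here is a rough sketch of the terms one would hope to see in the index for typeahead, given `min_gram` 1, `max_gram` 8 and `side` "front". It is a plain-Python model and not Lucene's implementation: the function names are made up for illustration, it assumes the text is first split on whitespace and lowercased, and it does not model the `edgeNGram` tokenizer named in the configuration above.

```python
def edge_ngrams(token, min_gram=1, max_gram=8):
    """Return the front-anchored prefixes of `token`, lengths min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def expected_index_terms(text, min_gram=1, max_gram=8):
    """Model only the edge n-gram filter step, applied to each word.

    Assumption: words are split on whitespace and lowercased first,
    which is a simplification of a real analysis chain.
    """
    terms = []
    for word in text.lower().split():
        terms.extend(edge_ngrams(word, min_gram, max_gram))
    return terms


print(edge_ngrams("lucene"))
# ['l', 'lu', 'luc', 'luce', 'lucen', 'lucene']
```

If the terms an index actually holds for a field differ from this kind of list (for example, only whole words appear), that suggests the analyzer is not being applied to the field.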

-- 
C. Benson Manica
cbmanica@gmail.com
