lucene-java-user mailing list archives

From "C. Benson Manica" <cbman...@gmail.com>
Subject Configuration for edge ngram typeahead
Date Fri, 04 Jan 2013 21:30:18 GMT
I have been Googling for an hour with no success whatsoever on how to
configure Lucene (Elasticsearch, actually, but presumably the same deal) to
index edge ngrams for typeahead.  I don't really know how filters,
analyzers, and tokenizers work together - the documentation isn't much help
on that count either - but I managed to cobble together the following
configuration, which I thought would work.  It doesn't, though - when I
index documents into the collection with these settings, they still only
match whole words instead of ngrams.  What am I missing?  *Should* this
work?  And how do I debug this to see whether the documents are actually
getting these settings applied to them?  (I've put the rough checks I was
planning to try below the config.)

{
    "settings":{
        "index":{
            "analysis":{
                "analyzer":{
                    "typeahead_analyzer":{
                        "type":"custom",
                        "tokenizer":"edgeNGram",
                        "filter":["typeahead_ngram"]
                    }
                },
                "filter":{
                    "typeahead_ngram":{
                        "type":"edgeNGram",
                        "min_gram":1,
                        "max_gram":8,
                        "side":"front"
                    }
                }
            }
        }
    },
    "mappings": {
        "name": {
            "properties": {
                "name": {
                    "type": "string",
                    "analyzer": "typeahead_analyzer"
                }
            }
        }
    }
}
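
For what it's worth, here is roughly how I was planning to check things,
assuming I'm reading the _settings, _mapping, and _analyze APIs right
("myindex" is just a placeholder for my actual index name):

    # Check that the index actually picked up the custom analyzer and filter
    # (I'm assuming analysis settings only take effect if they were present
    # when the index was created).
    curl -XGET 'http://localhost:9200/myindex/_settings?pretty=true'

    # Check that the "name" field mapping really points at typeahead_analyzer.
    curl -XGET 'http://localhost:9200/myindex/_mapping?pretty=true'

    # Run a sample string through the analyzer and look at the emitted tokens;
    # if this only returns whole words, the ngram part isn't taking effect.
    curl -XGET 'http://localhost:9200/myindex/_analyze?analyzer=typeahead_analyzer&pretty=true' -d 'benson'

If nothing custom shows up under _settings, I'm guessing I would have to
delete and recreate the index with these settings rather than just putting
them on an existing index - but that's part of what I'm not sure about.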

-- 
C. Benson Manica
cbmanica@gmail.com
