lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Adrien Grand <>
Subject Re: How FST constructed in lucene?
Date Fri, 17 Feb 2017 13:58:39 GMT
Le ven. 17 févr. 2017 à 11:17, krish mohan <> a
écrit :

> During search, whether Lucene uses FST in .tip file to match against the
> terms? How the changes to the index will be updated in FST? Will it be
> re-constructed or will it be updated in existing FST?

Lucene never updates existing files. What will happen in practice when you
add or update documents is that it will build new segments that will build
their own FST. FSTs are also re-built upon merge.

> In case of wildcard and fuzzy queries, Lucene needs to test a large number
> of terms. Will FST be useful to skip some portion of terms from comparing?

Indeed it will. When intersecting the terms dictionary with an automaton,
Lucene basically performs a leap-frog between the terms contained in the
terms dictionary and the terms that are accepted by the automaton. In some
cases, the terms dictionary will lead the iteration and the automaton will
be used to verify that terms match, but in other cases Lucene will ask the
automaton what the next accepted term is and will ask the terms dictionary
to advance on or beyond that term (seekCeil), which uses the FST under the
hood. You can look at AutomatonTermsEnum and FilteredTermsEnum if you would
like to learn more about how it works.

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message