lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: FST Builder pruning
Date Tue, 19 Nov 2013 02:21:39 GMT
Many thanks Mike.

In a given document, I usually have 15-20 fields out of which 6-7 fields
are plain key-value fields.

Typically these key-value fields don't involve prefixes, reflexes, fuzzies
etc...It's always a full match. Non-existent values are also not possible
during search.

In such a case, will it be better for me to use the old TII/TIS formats or
the BlockTree format will be much better for K-V access also?

--
Ravi

On Monday, November 18, 2013, Michael McCandless wrote:

> Yes, BlockTreeTermsWriter uses freezeTail to figure out where to draw
> the lines for assigning terms to blocks, but to build the trie terms
> index it builds a separate FST, by adding in each block's prefix (it
> doesn't use the FST's builder pruning to create the trie).
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Nov 15, 2013 at 1:33 PM, Ravikumar Govindarajan
> <ravikumar.govindarajan@gmail.com <javascript:;>> wrote:
> > Yeah, now I kind of understood.
> >
> > Is this why BlockTreeTermsWriter plugs in it's freezeTail logic of
> meeting
> > min-nbr of terms per block and building a trie for locating sub-blocks?
> >
> > --
> > Ravi
> >
> >
> > On Fri, Nov 15, 2013 at 11:17 PM, Michael McCandless <
> > lucene@mikemccandless.com <javascript:;>> wrote:
> >
> >> When you turn on pruning, FST Builder will just remove nodes that
> >> don't have a high enough count of input terms traversing through them.
> >>  E.g. if minSuffixCount1 is 100 then only FST nodes that see >= 100
> >> input terms coming through them, are preserved.
> >>
> >> You can use this to build a prefix trie instead of the full FST.
> >>
> >> Creating a custom tail freezer is very expert: it lets you implement
> >> arbitrary logic on which nodes are pruned or not.
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >>
> >> On Fri, Nov 15, 2013 at 12:16 PM, Ravikumar Govindarajan
> >> <ravikumar.govindarajan@gmail.com <javascript:;>> wrote:
> >> > I was trying to understand some logic in Builder class of FST.
> >> >
> >> > The method freezeTail() looks quite hairy. I gather that there is an
> some
> >> > logic for pruning a node or compiling it.
> >> >
> >> > What exactly is pruning a node? An example of it will be really really
> >> > helpful
> >> >
> >> > --
> >> > Ravi
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org<javascript:;>
> >> For additional commands, e-mail: java-user-help@lucene.apache.org<javascript:;>
> >>
> >>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org<javascript:;>
> For additional commands, e-mail: java-user-help@lucene.apache.org<javascript:;>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message