lucene-java-user mailing list archives

From Vadim Gindin <vgin...@detectum.com>
Subject Re: Terminology. LeafReader -> TermEnum -> PostingsEnum
Date Thu, 14 Dec 2017 09:39:21 GMT
I made a mistake in question 5. It is PostingsEnum that has many
implementations, not DocIdSetIterator. Please read question 5 as
follows.

5. Should I use a concrete implementation of PostingsEnum? When does that
make sense? Or should I always take the PostingsEnum returned by a call to
TermsEnum.postings(...)?
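
To make the terminology in question 5 concrete, here is a toy inverted index
in plain Java. It is only a sketch: ToyFieldIndex and Posting are hypothetical
names, not Lucene classes, and the whitespace tokenization is an assumption.
Looking up a term plays the role of reader.terms(field) followed by
TermsEnum.postings(...), and the returned list stands in for what a
PostingsEnum iterates over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy inverted index for one field. Hypothetical classes, NOT Lucene's API. */
public class ToyPostingsDemo {

    /** One posting: a document containing the term, plus the term's frequency there. */
    public static final class Posting {
        public final int docId;
        public final int freq;

        public Posting(int docId, int freq) {
            this.docId = docId;
            this.freq = freq;
        }
    }

    /** Maps every term of a single field to its postings list; terms are kept sorted. */
    public static final class ToyFieldIndex {
        private final Map<String, List<Posting>> postingsByTerm = new TreeMap<>();

        /** Documents must be added in increasing docId order so each list stays sorted. */
        public void addDocument(int docId, String text) {
            // Count how often each whitespace-separated token occurs in this document.
            Map<String, Integer> freqs = new TreeMap<>();
            for (String token : text.split("\\s+")) {
                if (!token.isEmpty()) {
                    freqs.merge(token, 1, Integer::sum);
                }
            }
            // Append one posting per distinct term to that term's postings list.
            for (Map.Entry<String, Integer> e : freqs.entrySet()) {
                postingsByTerm
                    .computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                    .add(new Posting(docId, e.getValue()));
            }
        }

        /** Returns the postings for a term, or an empty list if the term is absent. */
        public List<Posting> postings(String term) {
            return postingsByTerm.getOrDefault(term, new ArrayList<>());
        }
    }

    public static void main(String[] args) {
        ToyFieldIndex index = new ToyFieldIndex();
        index.addDocument(0, "red apple red");
        index.addDocument(1, "green apple");

        // Walk the postings of one term, the way a search would.
        for (Posting p : index.postings("red")) {
            System.out.println("doc=" + p.docId + " freq=" + p.freq);
        }
    }
}
```

With the two documents above, the lookup for "red" yields a single posting
(document 0, frequency 2), while "apple" yields one posting per document.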

I forgot one interesting question.

6. PostingsEnum has an AttributeSource field, atts. It looks like a
connection point with the query Analyzer. Is that true? If so, it could be
very useful for me; what is the appropriate way to use this attribute
source? Let's assume that I need to keep some coefficients along with
tokens to use them later in scoring. For example, if the matched token is
a synonym, I could multiply the query score by 0.75.

Regards,
Vadim Gindin

On Thu, Dec 14, 2017 at 2:15 PM, Vadim Gindin <vgindin@detectum.com> wrote:

> Hi All
>
> I have a question about the API, particularly about the terminology it
> uses.
>
> 1. LeafReader. Why does its name start with "Leaf"? Am I right that such
> a reader is intended for reading only one leaf of the index tree? Does it
> mean that it works inside a context (LeafReaderContext) of the several
> documents "physically" located in that leaf?
>
> 2. Say our LeafReader is positioned on some document; reader.terms(field)
> will then return the term list for that single field from the index. Right?
>
> 3. LeafReader is a subclass of IndexReader, which has
> getTermVectors(int docID). Can I use it in my custom Query (to be aware
> of all of a document's fields) instead of terms(field)?
>
> 4. That is, LeafReader contains statistical methods, methods returning
> document values, and methods returning terms and postings. terms() and
> postings() are the ones intended for search.
>
> 3. What are postings, and what is PostingsEnum? Why does the name start
> with "Posting"? My native language is Russian, and I'm a bit confused
> trying to find a corresponding meaning of this word in a search context.
>
> 5. OK, I see that PostingsEnum implements a basic interface,
> DocIdSetIterator, but PostingsEnum is one of approximately 20
> implementations of that interface. Why is it the one used in LeafReader?
> What is the principal difference between these 20 implementations, and
> which of them can be really useful?
>
> Regards,
> Vadim Gindin
>
