lucene-java-user mailing list archives

From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: Overall doc-count in TermStats, during flush...
Date Wed, 20 Mar 2013 13:19:42 GMT
Thanks, Simon, for the quick update.

Our docs are always uniform, with the same set of fields added to each one,
and that led to the confusion.

--
Ravi

On Wed, Mar 20, 2013 at 6:33 PM, Simon Willnauer
<simon.willnauer@gmail.com> wrote:

> The BitSet counts how many documents have one or more values in this
> field; some docs might not have any values in it.
> state.segmentInfo.getDocCount() is the number of docs in the whole
> segment, but we are flushing a single field here. Since 4.0 we keep
> per-field doc count statistics in the index, so we pass down the
> cardinality here and can't use the segment's doc count.
>
> hope that helps
>
> simon
>
> On Wed, Mar 20, 2013 at 1:12 PM, Ravikumar Govindarajan
> <ravikumar.govindarajan@gmail.com> wrote:
> > This is some internal code I came across in Lucene today, and I am
> > unable to decipher it.
> >
> > FreqProxTermsWriterPerField.java
> >
> > void flush(String fieldName, FieldsConsumer consumer,
> >            final SegmentWriteState state)
> > {
> >   .............
> >   FixedBitSet visitedDocs = new FixedBitSet(state.segmentInfo.getDocCount());
> >   for (int i = 0; i < numTerms; i++)
> >   {
> >     .............
> >     visitedDocs.set(docID);
> >     .........
> >     termsConsumer.finishTerm(text, new TermStats(docFreq, writeTermFreq ? totTF : -1));
> >     // We plan to pass the state.segmentInfo.getDocCount() in TermStats, above.
> >     // Is it wrong to do this here?
> >   }
> >   // Once all terms are over
> >   termsConsumer.finish(writeTermFreq ? sumTotalTermFreq : -1, sumDocFreq,
> >       visitedDocs.cardinality());
> >   // Why are we doing cardinality() instead of getDocCount() here?
> >   // Can there be un-visited docs during a flush?
> > }
> >
> > Can someone help me understand this?
> >
> > --
> > Ravi
>
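
To illustrate the point about per-field doc counts, here is a minimal sketch. It is not Lucene's code: it uses java.util.BitSet as a stand-in for FixedBitSet, and the class name, the method name, and the sample postings are all hypothetical. It only shows why the cardinality of the visited-docs bit set for one field can be smaller than the segment's doc count when some docs have no value in that field.

```java
import java.util.BitSet;

// Hypothetical illustration, not part of Lucene.
class FieldDocCountSketch {

    /**
     * Counts the docs that have at least one term in a single field.
     * docIDsPerTerm[i] holds the doc IDs in which term i of that field occurs.
     * segmentDocCount is only used to size the bit set, as in the quoted flush().
     */
    static int docCountForField(int[][] docIDsPerTerm, int segmentDocCount) {
        BitSet visitedDocs = new BitSet(segmentDocCount);
        for (int[] docIDs : docIDsPerTerm) {
            for (int docID : docIDs) {
                // Setting the same doc twice is harmless: a doc is counted once
                // no matter how many terms of this field it contains.
                visitedDocs.set(docID);
            }
        }
        return visitedDocs.cardinality();
    }

    public static void main(String[] args) {
        int segmentDocCount = 4; // docs 0..3 in the segment

        // Assumed sample data: "title" is present in every doc,
        // "tags" only in docs 0 and 2.
        int[][] title = { {0, 1, 2, 3}, {1, 3} };
        int[][] tags = { {0, 2}, {2} };

        System.out.println("title docCount = " + docCountForField(title, segmentDocCount));
        System.out.println("tags docCount  = " + docCountForField(tags, segmentDocCount));
    }
}
```

When every doc carries the same set of fields, as in Ravi's index, the two numbers coincide for every field, which is why the difference is easy to miss.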
