lucene-java-user mailing list archives

From Phil Whelan <phil...@gmail.com>
Subject Re: Weird discrepancy with term counts vs. terms (off by 1)
Date Sun, 02 Aug 2009 16:48:55 GMT
Hi Jim,

On Sun, Aug 2, 2009 at 9:08 AM, Phil Whelan<phil123@gmail.com> wrote:
>
>> So then, I reviewed the index using Luke, and what I saw with that was that there
>> were indeed only 12 "path" terms (under "Term Count" on the left), but, when I clicked the
>> "Show Top Terms" in Luke, there were 13 terms listed by Luke.
>
> Yes, I just checked this and this seems to be a bug with Luke. It
> always shows 1 less in "Term Count" than it should. Well spotted.

I was able to see why this was happening in the Luke source and I've
submitted the following patch to Andrzej, the author of Luke.

Thanks,
Phil

--- luke.orig/src/org/getopt/luke/Luke.java	2009-03-19 22:41:34.000000000 -0700
+++ luke-src-0.9.2/src/org/getopt/luke/Luke.java	2009-08-02 09:33:24.000000000 -0700
@@ -813,23 +813,18 @@
       setString(iFields, "text", String.valueOf(idxFields.length));
       Object iTerms = find(pOver, "iTerms");
       termCounts.clear();
-      FieldTermCount ftc = new FieldTermCount();
+      FieldTermCount ftc = null;
       TermEnum te = ir.terms();
       numTerms = 0;
       while (te.next()) {
         Term currTerm = te.term();
-        if (ftc.fieldname == null) {
+        if (ftc == null || ftc.fieldname == null || ftc.fieldname != currTerm.field()) {
           // initialize
-          ftc.fieldname = currTerm.field();
-          termCounts.put(ftc.fieldname, ftc);
-        }
-        if (ftc.fieldname == currTerm.field()) {
-          ftc.termCount++;
-        } else {
           ftc = new FieldTermCount();
           ftc.fieldname = currTerm.field();
           termCounts.put(ftc.fieldname, ftc);
         }
+        ftc.termCount++;
         numTerms++;
       }
       te.close();
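
To show what the patch changes, here is a minimal sketch of the two counting loops without Lucene or Luke. The class name TermCountSketch, the methods countOld and countPatched, and the {field, text} string arrays are all made up for illustration; they stand in for FieldTermCount and the TermEnum loop above. In the old loop, the first term of every field after the first one creates the new entry but is never counted, so that field's count ends up one short. The sketch compares field names with equals(); the patch itself keeps Luke's == / != comparison, which as far as I know relies on Lucene returning interned field names from Term.field().

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TermCountSketch {

    // Old logic: each term is {field, text}, sorted by field like a term enumeration.
    static Map<String, Integer> countOld(List<String[]> terms) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String field = null;
        for (String[] t : terms) {
            if (field == null) {
                // initialize on the very first term
                field = t[0];
                counts.put(field, 0);
            }
            if (field.equals(t[0])) {
                counts.merge(field, 1, Integer::sum);
            } else {
                // field changed: a new entry is created, but this term is not counted
                field = t[0];
                counts.put(field, 0);
            }
        }
        return counts;
    }

    // Patched logic: start a new entry when the field changes, then always count the term.
    static Map<String, Integer> countPatched(List<String[]> terms) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String field = null;
        for (String[] t : terms) {
            if (field == null || !field.equals(t[0])) {
                field = t[0];
                counts.put(field, 0);
            }
            counts.merge(field, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> terms = Arrays.asList(
            new String[] {"contents", "a"},
            new String[] {"contents", "b"},
            new String[] {"path", "x"},
            new String[] {"path", "y"},
            new String[] {"path", "z"});
        System.out.println("old:     " + countOld(terms));     // old:     {contents=2, path=2}
        System.out.println("patched: " + countPatched(terms)); // patched: {contents=2, path=3}
    }
}
```

Note that the first field in the enumeration is counted correctly by the old loop; only the fields that follow it lose one term each.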


