lucene-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject updating fieldNorms in mass
Date Tue, 14 Feb 2006 22:04:53 GMT

I just noticed the IndexReader.setNorm method(s) today and was extremely
stoked -- after rebuilding my dev index from scratch three times last week
because I wanted to try out tweaks to Similarity.lengthNorm, the idea of
being able to directly change the norms without rebuilding from scratch is
looking *really* good.

in the case where doc boosts and field boosts aren't used, it seems like
it would be very easy to write a maintenance app that did something
like...

   get instance of similarity based on input
   foreach fieldName in input {
       // total number of tokens per doc for this field
       int[] termCounts = new int[maxDoc];
       foreach Term in TermEnum for field {
          foreach TermDoc on that Term {
              termCounts[td.doc()] += td.freq();
          }
       }
       foreach doc {
          IndexReader.setNorm(doc, fieldName, Similarity.encodeNorm
                  (similarity.lengthNorm(fieldName, termCounts[doc])));
       }
   }
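
The counting step above can be sketched without an index at all. The
following is a minimal, self-contained illustration, not the maintenance
app itself: the class name NormRecalcSketch, the in-memory postings map,
and the choice to return 0 for documents with no tokens are all
assumptions made for the example. The lengthNorm formula mirrors
DefaultSimilarity (1 / sqrt(numTokens)); a custom Similarity would
substitute its own.

```java
import java.util.HashMap;
import java.util.Map;

class NormRecalcSketch {

    // Sum term frequencies per document over every term of one field.
    // Each posting is a {docId, freq} pair, standing in for what
    // TermDocs.doc() and TermDocs.freq() would return.
    static int[] countTokens(Map<String, int[][]> postingsByTerm, int maxDoc) {
        int[] termCounts = new int[maxDoc];
        for (int[][] postings : postingsByTerm.values()) {
            for (int[] posting : postings) {
                termCounts[posting[0]] += posting[1];
            }
        }
        return termCounts;
    }

    // Same formula as DefaultSimilarity.lengthNorm. Returning 0 for an
    // empty field is this sketch's own choice, not Lucene behavior.
    static float lengthNorm(int numTokens) {
        if (numTokens == 0) {
            return 0f;
        }
        return (float) (1.0 / Math.sqrt(numTokens));
    }

    // Recompute one (unencoded) norm per document for a single field.
    static float[] recomputeNorms(Map<String, int[][]> postingsByTerm, int maxDoc) {
        int[] termCounts = countTokens(postingsByTerm, maxDoc);
        float[] norms = new float[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            norms[doc] = lengthNorm(termCounts[doc]);
        }
        return norms;
    }

    public static void main(String[] args) {
        // Hypothetical postings for one field: term -> {docId, freq} pairs.
        Map<String, int[][]> postings = new HashMap<>();
        postings.put("lucene", new int[][] {{0, 2}, {1, 1}});
        postings.put("norms", new int[][] {{0, 2}});

        float[] norms = recomputeNorms(postings, 3);
        for (int doc = 0; doc < norms.length; doc++) {
            System.out.println("doc " + doc + " -> " + norms[doc]);
        }
    }
}
```

Against a real index, the postings would come from the reader's TermEnum
and TermDocs, and each computed value would be passed through
Similarity.encodeNorm before being handed to IndexReader.setNorm.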


...does anyone see anything wrong with the overall approach?

has anyone implemented this already that they'd like to share?  (or any
gotchas they ran into that I should be wary of?)


-Hoss



