lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Confusing norms
Date Thu, 26 May 2011 08:32:05 GMT
Hi

I wrote the following test:

{code}
  public void testConfusingNorms() throws Exception {
    Directory dir = newDirectory();
    LogMergePolicy lmp = newLogMergePolicy(false);
    IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT,
        new MockAnalyzer(random)).setMergePolicy(lmp);
    IndexWriter w = new IndexWriter(dir, conf);
    Document doc = new Document();
    doc.add(new Field("c", "some text", Store.YES, Index.ANALYZED));
    w.addDocument(doc);
    doc = new Document();
    doc.add(new Field("c", "delete", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
    w.addDocument(doc);
    w.close();

    IndexReader r = IndexReader.open(dir, false);
    r.setNorm(0, "c", (byte) 1);
    r.close();

    // Look for the sep norms file
    boolean found = false;
    for (String s : dir.listAll()) {
      if (IndexFileNames.isSeparateNormsFile(s)) {
        found = true;
        break;
      }
    }
    assertTrue("separate norms file not found", found);

    dir.close();
  }
{code}

You will also need to add the following method to IndexFileNames (not committed yet):
{code}
  /**
   * Returns true if the given filename ends with the separate norms file
   * pattern: {@code SEPARATE_NORMS_EXTENSION + "[0-9]+"}.
   */
  public static boolean isSeparateNormsFile(String filename) {
    int idx = filename.lastIndexOf('.');
    if (idx == -1) return false;
    String ext = filename.substring(idx + 1);
    return Pattern.matches(SEPARATE_NORMS_EXTENSION + "[0-9]+", ext);
  }
{code}
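
To make it easier to see which names the check accepts, here is a standalone sketch of the same logic. It assumes the separate norms extension is "s" (a local constant stands in for IndexFileNames.SEPARATE_NORMS_EXTENSION), and the class name and file names are made up for illustration. It also compiles the pattern once instead of on every call, which is a small variation on the method above and not a required change.

```java
import java.util.regex.Pattern;

final class SeparateNormsNameCheck {
  // Assumption: stands in for IndexFileNames.SEPARATE_NORMS_EXTENSION.
  static final String SEPARATE_NORMS_EXTENSION = "s";

  // Compiled once; matches the extension followed by one or more digits.
  private static final Pattern SEP_NORMS =
      Pattern.compile(Pattern.quote(SEPARATE_NORMS_EXTENSION) + "[0-9]+");

  static boolean isSeparateNormsFile(String filename) {
    int idx = filename.lastIndexOf('.');
    if (idx == -1) return false; // no extension at all
    return SEP_NORMS.matcher(filename.substring(idx + 1)).matches();
  }

  public static void main(String[] args) {
    // Hypothetical file names, chosen only to exercise each branch.
    String[] names = {"_0_1.s0", "_0.nrm", "_0.fdt", "segments_2", "_0.s"};
    for (String name : names) {
      System.out.println(name + " -> " + isSeparateNormsFile(name));
    }
  }
}
```

Under that assumption only the first name is accepted: the others either have no extension, a different extension, or the bare extension without a trailing number.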

The test adds two documents with a field "c": one analyzed (with norms) and
one not analyzed and without norms. According to the documentation of
"NOT_ANALYZED_NO_NORMS":

> Note that once you index a given field *with* norms enabled, disabling norms
> will have no effect. In other words, for this to have the above described
> effect on a field, all instances of that field must be indexed with
> NOT_ANALYZED_NO_NORMS from the beginning.

Since I add one instance of the field with norms enabled, I'd expect norms to
exist for that field. However, that's not the case.
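
Read literally, the documented rule amounts to a per-field flag that stays "omit norms" only if every instance of the field omitted them. The sketch below is a toy model of that reading, not Lucene's implementation; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class StickyNormsModel {
  // Field name -> true if every instance seen so far omitted norms.
  private final Map<String, Boolean> omitNorms = new HashMap<>();

  // Record one indexed instance of a field.
  void addFieldInstance(String field, boolean omit) {
    // Norms remain omitted only if the previous state AND this instance omit them.
    omitNorms.merge(field, omit, (prev, cur) -> prev && cur);
  }

  // A field has norms if it was seen and at least one instance enabled them.
  boolean hasNorms(String field) {
    Boolean omit = omitNorms.get(field);
    return omit != null && !omit;
  }

  public static void main(String[] args) {
    StickyNormsModel model = new StickyNormsModel();
    model.addFieldInstance("c", false); // like Index.ANALYZED
    model.addFieldInstance("c", true);  // like Index.NOT_ANALYZED_NO_NORMS
    System.out.println("c has norms: " + model.hasNorms("c"));
  }
}
```

In this model, field "c" from the test ends up with norms, which is the outcome the documentation leads me to expect.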

The IndexReader.setNorm call has no effect, because SegmentReader.doSetNorm
decides this is not an indexed field (or, assuming the documentation is
wrong, a field without norms):

  protected void doSetNorm(int doc, String field, byte value) throws IOException {
    SegmentNorms norm = norms.get(field);
    if (norm == null)                             // not an indexed field
      return;
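
The effect of that early return can be reproduced in isolation. The following is a toy model, not SegmentReader: the class, the byte[] storage, and the "dirty" flag (a hypothetical stand-in for "something changed that needs to be written") are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class SilentSetNormModel {
  // Field name -> one norm byte per document; absent means "no norms for this field".
  private final Map<String, byte[]> norms = new HashMap<>();
  private boolean dirty = false;

  // Hypothetical helper: declare that a field has norms for maxDoc documents.
  void registerNorms(String field, int maxDoc) {
    norms.put(field, new byte[maxDoc]);
  }

  void setNorm(int doc, String field, byte value) {
    byte[] norm = norms.get(field);
    if (norm == null) return; // same shape as the guard above: silently ignored
    norm[doc] = value;
    dirty = true;
  }

  boolean isDirty() {
    return dirty;
  }

  public static void main(String[] args) {
    SilentSetNormModel reader = new SilentSetNormModel();
    reader.setNorm(0, "c", (byte) 1); // no norms registered for "c"
    System.out.println("dirty after ignored call: " + reader.isDirty());
    reader.registerNorms("c", 2);
    reader.setNorm(0, "c", (byte) 1);
    System.out.println("dirty after real update: " + reader.isDirty());
  }
}
```

If the field has no norms entry, the call returns without recording any change, which is presumably why the test finds no separate norms file afterwards.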

The same test passes on 3x, so I assume this is a bug that exists only on
trunk?

Shai
