lucene-commits mailing list archives

From rm...@apache.org
Subject svn commit: r1604355 - in /lucene/dev/branches/branch_4x: ./ lucene/ lucene/analysis/ lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/ lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/
Date Sat, 21 Jun 2014 13:19:03 GMT
Author: rmuir
Date: Sat Jun 21 13:19:02 2014
New Revision: 1604355

URL: http://svn.apache.org/r1604355
Log:
LUCENE-5778: support hunspell morphological description fields

Added:
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestMorphAlias.java
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestMorphAlias.java
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestMorphData.java
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestMorphData.java
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestSpaces.java
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestSpaces.java
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morphalias.aff
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morphalias.aff
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morphalias.dic
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morphalias.dic
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morphdata.aff
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morphdata.aff
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morphdata.dic
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morphdata.dic
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/spaces.aff
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/spaces.aff
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/spaces.dic
      - copied unchanged from r1604354, lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/spaces.dic
Modified:
    lucene/dev/branches/branch_4x/   (props changed)
    lucene/dev/branches/branch_4x/lucene/   (props changed)
    lucene/dev/branches/branch_4x/lucene/CHANGES.txt
    lucene/dev/branches/branch_4x/lucene/analysis/   (props changed)
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Dictionary.java
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Stemmer.java
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries2.java
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/circumfix.dic
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/conv.dic
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/dependencies.dic
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/homonyms.dic
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/ignore.dic
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morph.dic
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/twosuffixes.dic

Modified: lucene/dev/branches/branch_4x/lucene/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/CHANGES.txt?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/CHANGES.txt (original)
+++ lucene/dev/branches/branch_4x/lucene/CHANGES.txt Sat Jun 21 13:19:02 2014
@@ -5,6 +5,11 @@ http://s.apache.org/luceneversions
 
 ======================= Lucene 4.10.0 ======================
 
+New Features
+
+* LUCENE-5778: Support hunspell morphological description fields/aliases.
+  (Robert Muir)
+
 Optimizations
 
 * LUCENE-5780: Make OrdinalMap more memory-efficient, especially in case the

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Dictionary.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Dictionary.java?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Dictionary.java (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Dictionary.java Sat Jun 21 13:19:02 2014
@@ -27,6 +27,7 @@ import org.apache.lucene.util.IntsRef;
 import org.apache.lucene.util.OfflineSorter;
 import org.apache.lucene.util.OfflineSorter.ByteSequencesReader;
 import org.apache.lucene.util.OfflineSorter.ByteSequencesWriter;
+import org.apache.lucene.util.RamUsageEstimator;
 import org.apache.lucene.util.automaton.CharacterRunAutomaton;
 import org.apache.lucene.util.automaton.RegExp;
 import org.apache.lucene.util.fst.Builder;
@@ -74,6 +75,7 @@ public class Dictionary {
   static final char[] NOFLAGS = new char[0];
   
   private static final String ALIAS_KEY = "AF";
+  private static final String MORPH_ALIAS_KEY = "AM";
   private static final String PREFIX_KEY = "PFX";
   private static final String SUFFIX_KEY = "SFX";
   private static final String FLAG_KEY = "FLAG";
@@ -115,9 +117,21 @@ public class Dictionary {
 
  private FlagParsingStrategy flagParsingStrategy = new SimpleFlagParsingStrategy(); // Default flag parsing strategy
 
+  // AF entries
   private String[] aliases;
   private int aliasCount = 0;
   
+  // AM entries
+  private String[] morphAliases;
+  private int morphAliasCount = 0;
+  
+  // st: morphological entries (either directly, or aliased from AM)
+  private String[] stemExceptions = new String[8];
+  private int stemExceptionCount = 0;
+  // we set this during sorting, so we know to add an extra FST output.
+  // when set, some words have exceptional stems, and the last entry is a pointer to stemExceptions
+  boolean hasStemExceptions;
+  
   private final File tempDir = OfflineSorter.defaultTempDir(); // TODO: make this configurable?
   
   boolean ignoreCase;
@@ -194,6 +208,7 @@ public class Dictionary {
       readDictionaryFiles(dictionaries, decoder, b);
       words = b.finish();
       aliases = null; // no longer needed
+      morphAliases = null; // no longer needed
     } finally {
       IOUtils.closeWhileHandlingException(out, aff1, aff2);
       aff.delete();
@@ -278,6 +293,8 @@ public class Dictionary {
       }
       if (line.startsWith(ALIAS_KEY)) {
         parseAlias(line);
+      } else if (line.startsWith(MORPH_ALIAS_KEY)) {
+        parseMorphAlias(line);
       } else if (line.startsWith(PREFIX_KEY)) {
        parseAffix(prefixes, line, reader, PREFIX_CONDITION_REGEX_PATTERN, seenPatterns, seenStrips);
       } else if (line.startsWith(SUFFIX_KEY)) {
@@ -639,23 +656,69 @@ public class Dictionary {
   }
 
   final char FLAG_SEPARATOR = 0x1f; // flag separator after escaping
+  final char MORPH_SEPARATOR = 0x1e; // separator for boundary of entry (may be followed by morph data)
   
   String unescapeEntry(String entry) {
     StringBuilder sb = new StringBuilder();
-    for (int i = 0; i < entry.length(); i++) {
+    int end = morphBoundary(entry);
+    for (int i = 0; i < end; i++) {
       char ch = entry.charAt(i);
       if (ch == '\\' && i+1 < entry.length()) {
         sb.append(entry.charAt(i+1));
         i++;
       } else if (ch == '/') {
         sb.append(FLAG_SEPARATOR);
+      } else if (ch == MORPH_SEPARATOR || ch == FLAG_SEPARATOR) {
+        // BINARY EXECUTABLES EMBEDDED IN ZULU DICTIONARIES!!!!!!!
       } else {
         sb.append(ch);
       }
     }
+    sb.append(MORPH_SEPARATOR);
+    if (end < entry.length()) {
+      for (int i = end; i < entry.length(); i++) {
+        char c = entry.charAt(i);
+        if (c == FLAG_SEPARATOR || c == MORPH_SEPARATOR) {
+          // BINARY EXECUTABLES EMBEDDED IN ZULU DICTIONARIES!!!!!!!
+        } else {
+          sb.append(c);
+        }
+      }
+    }
     return sb.toString();
   }
   
+  static int morphBoundary(String line) {
+    int end = indexOfSpaceOrTab(line, 0);
+    if (end == -1) {
+      return line.length();
+    }
+    while (end >= 0 && end < line.length()) {
+      if (line.charAt(end) == '\t' ||
+          end+3 < line.length() && 
+          Character.isLetter(line.charAt(end+1)) && 
+          Character.isLetter(line.charAt(end+2)) &&
+          line.charAt(end+3) == ':') {
+        break;
+      }
+      end = indexOfSpaceOrTab(line, end+1);
+    }
+    if (end == -1) {
+      return line.length();
+    }
+    return end;
+  }
+  
+  static int indexOfSpaceOrTab(String text, int start) {
+    int pos1 = text.indexOf('\t', start);
+    int pos2 = text.indexOf(' ', start);
+    if (pos1 >= 0 && pos2 >= 0) {
+      return Math.min(pos1, pos2);
+    } else {
+      return Math.max(pos1, pos2);
+    }
+  }
+  
   /**
    * Reads the dictionary file through the provided InputStreams, building up the words map
    *
@@ -678,9 +741,23 @@ public class Dictionary {
        String line = lines.readLine(); // first line is number of entries (approximately, sometimes)
         
         while ((line = lines.readLine()) != null) {
+          // wild and unpredictable code comment rules
+          if (line.isEmpty() || line.charAt(0) == '/' || line.charAt(0) == '#' || line.charAt(0) == '\t') {
+            continue;
+          }
           line = unescapeEntry(line);
+          // if we havent seen any stem exceptions, try to parse one
+          if (hasStemExceptions == false) {
+            int morphStart = line.indexOf(MORPH_SEPARATOR);
+            if (morphStart >= 0 && morphStart < line.length()) {
+              hasStemExceptions = parseStemException(line.substring(morphStart+1)) != null;
+            }
+          }
           if (needsInputCleaning) {
-            int flagSep = line.lastIndexOf(FLAG_SEPARATOR);
+            int flagSep = line.indexOf(FLAG_SEPARATOR);
+            if (flagSep == -1) {
+              flagSep = line.indexOf(MORPH_SEPARATOR);
+            }
             if (flagSep == -1) {
               CharSequence cleansed = cleanInput(line, sb);
               writer.write(cleansed.toString().getBytes(StandardCharsets.UTF_8));
@@ -720,7 +797,7 @@ public class Dictionary {
         scratch1.length = o1.length;
         
         for (int i = scratch1.length - 1; i >= 0; i--) {
-          if (scratch1.bytes[scratch1.offset + i] == FLAG_SEPARATOR) {
+          if (scratch1.bytes[scratch1.offset + i] == FLAG_SEPARATOR || scratch1.bytes[scratch1.offset + i] == MORPH_SEPARATOR) {
             scratch1.length = i;
             break;
           }
@@ -731,7 +808,7 @@ public class Dictionary {
         scratch2.length = o2.length;
         
         for (int i = scratch2.length - 1; i >= 0; i--) {
-          if (scratch2.bytes[scratch2.offset + i] == FLAG_SEPARATOR) {
+          if (scratch2.bytes[scratch2.offset + i] == FLAG_SEPARATOR || scratch2.bytes[scratch2.offset + i] == MORPH_SEPARATOR) {
             scratch2.length = i;
             break;
           }
@@ -763,22 +840,15 @@ public class Dictionary {
       line = scratchLine.utf8ToString();
       String entry;
       char wordForm[];
-      
-      int flagSep = line.lastIndexOf(FLAG_SEPARATOR);
+      int end;
+
+      int flagSep = line.indexOf(FLAG_SEPARATOR);
       if (flagSep == -1) {
         wordForm = NOFLAGS;
-        entry = line;
+        end = line.indexOf(MORPH_SEPARATOR);
+        entry = line.substring(0, end);
       } else {
-        // note, there can be comments (morph description) after a flag.
-        // we should really look for any whitespace: currently just tab and space
-        int end = line.indexOf('\t', flagSep);
-        if (end == -1)
-          end = line.length();
-        int end2 = line.indexOf(' ', flagSep);
-        if (end2 == -1)
-          end2 = line.length();
-        end = Math.min(end, end2);
-        
+        end = line.indexOf(MORPH_SEPARATOR);
         String flagPart = line.substring(flagSep + 1, end);
         if (aliasCount > 0) {
           flagPart = getAliasValue(Integer.parseInt(flagPart));
@@ -788,6 +858,19 @@ public class Dictionary {
         Arrays.sort(wordForm);
         entry = line.substring(0, flagSep);
       }
+      // we possibly have morphological data
+      int stemExceptionID = 0;
+      if (hasStemExceptions && end+1 < line.length()) {
+        String stemException = parseStemException(line.substring(end+1));
+        if (stemException != null) {
+          if (stemExceptionCount == stemExceptions.length) {
+            int newSize = ArrayUtil.oversize(stemExceptionCount+1, RamUsageEstimator.NUM_BYTES_OBJECT_REF);
+            stemExceptions = Arrays.copyOf(stemExceptions, newSize);
+          }
+          stemExceptionID = stemExceptionCount+1; // we use '0' to indicate no exception for the form
+          stemExceptions[stemExceptionCount++] = stemException;
+        }
+      }
 
       int cmp = currentEntry == null ? 1 : entry.compareTo(currentEntry);
       if (cmp < 0) {
@@ -809,8 +892,14 @@ public class Dictionary {
           currentEntry = entry;
           currentOrds = new IntsRef(); // must be this way
         }
-        currentOrds.grow(currentOrds.length+1);
-        currentOrds.ints[currentOrds.length++] = ord;
+        if (hasStemExceptions) {
+          currentOrds.grow(currentOrds.length+2);
+          currentOrds.ints[currentOrds.length++] = ord;
+          currentOrds.ints[currentOrds.length++] = stemExceptionID;
+        } else {
+          currentOrds.grow(currentOrds.length+1);
+          currentOrds.ints[currentOrds.length++] = ord;
+        }
       }
     }
     
@@ -868,6 +957,46 @@ public class Dictionary {
       throw new IllegalArgumentException("Bad flag alias number:" + id, ex);
     }
   }
+  
+  String getStemException(int id) {
+    return stemExceptions[id-1];
+  }
+  
+  private void parseMorphAlias(String line) {
+    if (morphAliases == null) {
+      //first line should be the aliases count
+      final int count = Integer.parseInt(line.substring(3));
+      morphAliases = new String[count];
+    } else {
+      String arg = line.substring(2); // leave the space
+      morphAliases[morphAliasCount++] = arg;
+    }
+  }
+  
+  private String parseStemException(String morphData) {
+    // first see if its an alias
+    if (morphAliasCount > 0) {
+      try {
+        int alias = Integer.parseInt(morphData.trim());
+        morphData = morphAliases[alias-1];
+      } catch (NumberFormatException e) {  
+        // fine
+      }
+    }
+    // try to parse morph entry
+    int index = morphData.indexOf(" st:");
+    if (index < 0) {
+      index = morphData.indexOf("\tst:");
+    }
+    if (index >= 0) {
+      int endIndex = indexOfSpaceOrTab(morphData, index+1);
+      if (endIndex < 0) {
+        endIndex = morphData.length();
+      }
+      return morphData.substring(index+4, endIndex);
+    }
+    return null;
+  }
 
   /**
    * Abstraction of the process of parsing flags taken from the affix and dic files
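
Note on the Dictionary.java change above: the following is a minimal standalone sketch of how the new boundary detection and `st:` extraction appear to fit together. It mirrors morphBoundary, indexOfSpaceOrTab and parseStemException from the diff, but the class name MorphDataSketch, the explicit morphAliases parameter, and the sample entries are assumptions made for illustration; they are not taken from the committed test files.

```java
// Hypothetical sketch; not part of the commit. Logic follows the diff above.
final class MorphDataSketch {

  // Earliest space or tab at or after start, or -1 if neither occurs.
  static int indexOfSpaceOrTab(String text, int start) {
    int tab = text.indexOf('\t', start);
    int space = text.indexOf(' ', start);
    if (tab >= 0 && space >= 0) {
      return Math.min(tab, space);
    }
    return Math.max(tab, space);
  }

  // Index where the word/flags part ends: the first tab, or the first
  // space that is followed by a two-letter field tag such as "st:".
  // Returns line.length() when no such boundary exists.
  static int morphBoundary(String line) {
    int end = indexOfSpaceOrTab(line, 0);
    while (end >= 0) {
      boolean isTab = line.charAt(end) == '\t';
      boolean isField = end + 3 < line.length()
          && Character.isLetter(line.charAt(end + 1))
          && Character.isLetter(line.charAt(end + 2))
          && line.charAt(end + 3) == ':';
      if (isTab || isField) {
        return end;
      }
      end = indexOfSpaceOrTab(line, end + 1);
    }
    return line.length();
  }

  // Extracts the value of the "st:" field, resolving a numeric AM alias
  // first. morphAliases is passed in here; the commit keeps it as a field.
  static String parseStemException(String morphData, String[] morphAliases) {
    if (morphAliases.length > 0) {
      try {
        int alias = Integer.parseInt(morphData.trim());
        morphData = morphAliases[alias - 1];
      } catch (NumberFormatException e) {
        // not an alias number; treat morphData as literal fields
      }
    }
    int index = morphData.indexOf(" st:");
    if (index < 0) {
      index = morphData.indexOf("\tst:");
    }
    if (index < 0) {
      return null;
    }
    int endIndex = indexOfSpaceOrTab(morphData, index + 1);
    if (endIndex < 0) {
      endIndex = morphData.length();
    }
    return morphData.substring(index + 4, endIndex);
  }

  public static void main(String[] args) {
    String line = "feet/X st:foot po:noun"; // made-up entry
    int end = morphBoundary(line);
    System.out.println(line.substring(0, end));                                // feet/X
    System.out.println(parseStemException(line.substring(end), new String[0])); // foot
    // A space inside the word is not a boundary; only the " po:" field is.
    System.out.println(morphBoundary("New York/X po:noun"));                   // 10
    // Runs of plain spaces never match, so "[MN]" is not split off.
    System.out.println(morphBoundary("nagy/C    [MN]"));                       // 14
    System.out.println(morphBoundary("nagy/C\t[MN]"));                         // 6
  }
}
```

If this reading is right, the last two calls suggest why the existing test .dic files in this commit switch from spaces to a tab before the bracketed comment: a run of spaces followed by `[MN]` is no longer recognized as the end of the entry, while a tab always is.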

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Stemmer.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Stemmer.java?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Stemmer.java (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Stemmer.java Sat Jun 21 13:19:02 2014
@@ -20,7 +20,6 @@ package org.apache.lucene.analysis.hunsp
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Arrays;
-import java.util.Collections;
 import java.util.List;
 
 import org.apache.lucene.analysis.util.CharArraySet;
@@ -48,6 +47,10 @@ final class Stemmer {
   private final StringBuilder scratchSegment = new StringBuilder();
   private char scratchBuffer[] = new char[32];
   
+  // its '1' if we have no stem exceptions, otherwise every other form
+  // is really an ID pointing to the exception table
+  private final int formStep;
+  
   /**
    * Constructs a new Stemmer which will use the provided Dictionary to create its stems.
    *
@@ -66,6 +69,7 @@ final class Stemmer {
         suffixReaders[level] = dictionary.suffixes.getBytesReader();
       }
     }
+    formStep = dictionary.hasStemExceptions ? 2 : 1;
   } 
   
   /**
@@ -101,12 +105,12 @@ final class Stemmer {
     if (forms != null) {
       // TODO: some forms should not be added, e.g. ONLYINCOMPOUND
       // just because it exists, does not make it valid...
-      for (int i = 0; i < forms.length; i++) {
-        stems.add(newStem(word, length));
+      for (int i = 0; i < forms.length; i += formStep) {
+        stems.add(newStem(word, length, forms, i));
       }
     }
     try {
-      stems.addAll(stem(word, length, -1, -1, -1, 0, true, true, false, false));
+      boolean v = stems.addAll(stem(word, length, -1, -1, -1, 0, true, true, false, false));
     } catch (IOException bogus) {
       throw new RuntimeException(bogus);
     }
@@ -135,10 +139,26 @@ final class Stemmer {
     return deduped;
   }
   
-  private CharsRef newStem(char buffer[], int length) {
+  private CharsRef newStem(char buffer[], int length, IntsRef forms, int formID) {
+    final String exception;
+    if (dictionary.hasStemExceptions) {
+      int exceptionID = forms.ints[forms.offset + formID + 1];
+      if (exceptionID > 0) {
+        exception = dictionary.getStemException(exceptionID);
+      } else {
+        exception = null;
+      }
+    } else {
+      exception = null;
+    }
+    
     if (dictionary.needsOutputCleaning) {
       scratchSegment.setLength(0);
-      scratchSegment.append(buffer, 0, length);
+      if (exception != null) {
+        scratchSegment.append(exception);
+      } else {
+        scratchSegment.append(buffer, 0, length);
+      }
       try {
         Dictionary.applyMappings(dictionary.oconv, scratchSegment);
       } catch (IOException bogus) {
@@ -148,7 +168,11 @@ final class Stemmer {
       scratchSegment.getChars(0, cleaned.length, cleaned, 0);
       return new CharsRef(cleaned, 0, cleaned.length);
     } else {
-      return new CharsRef(buffer, 0, length);
+      if (exception != null) {
+        return new CharsRef(exception);
+      } else {
+        return new CharsRef(buffer, 0, length);
+      }
     }
   }
 
@@ -387,7 +411,7 @@ final class Stemmer {
 
     IntsRef forms = dictionary.lookupWord(strippedWord, 0, length);
     if (forms != null) {
-      for (int i = 0; i < forms.length; i++) {
+      for (int i = 0; i < forms.length; i += formStep) {
         dictionary.flagLookup.get(forms.ints[forms.offset+i], scratch);
         char wordFlags[] = Dictionary.decodeFlags(scratch);
         if (Dictionary.hasFlag(wordFlags, flag)) {
@@ -413,7 +437,7 @@ final class Stemmer {
               continue;
             }
           }
-          stems.add(newStem(strippedWord, length));
+          stems.add(newStem(strippedWord, length, forms, i));
         }
       }
     }
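
Note on the Stemmer.java change above: when hasStemExceptions is set, each word's forms array appears to interleave a flag ordinal with a stem-exception ID (0 meaning "no exception", otherwise a 1-based index into the exception table), which is why the loops step by formStep. The sketch below illustrates that layout with plain arrays. The class FormStepSketch, the method stemsFor, and the ordinal values are hypothetical stand-ins, not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the interleaved forms layout; not part of the commit.
final class FormStepSketch {

  // forms holds [ord] per form without exceptions,
  // or [ord, exceptionID] pairs when hasStemExceptions is true.
  static List<String> stemsFor(String word, int[] forms,
                               boolean hasStemExceptions, String[] stemExceptions) {
    int formStep = hasStemExceptions ? 2 : 1;
    List<String> stems = new ArrayList<>();
    for (int i = 0; i < forms.length; i += formStep) {
      // ID 0 means the surface word is its own stem.
      int exceptionID = hasStemExceptions ? forms[i + 1] : 0;
      stems.add(exceptionID > 0 ? stemExceptions[exceptionID - 1] : word);
    }
    return stems;
  }

  public static void main(String[] args) {
    String[] exceptions = {"foot"};
    // One form (flag ord 7, made up) pointing at exception #1.
    System.out.println(stemsFor("feet", new int[] {7, 1}, true, exceptions));       // [foot]
    // Two homonym forms with no exceptions, stored as pairs.
    System.out.println(stemsFor("work", new int[] {3, 0, 4, 0}, true, exceptions)); // [work, work]
    // Same two forms in a dictionary without any stem exceptions.
    System.out.println(stemsFor("work", new int[] {3, 4}, false, new String[0]));   // [work, work]
  }
}
```

Storing the extra int only when at least one entry carries an `st:` field keeps dictionaries without morphological data at their previous size, at the cost of a second slot per form for dictionaries that do use it.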

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java Sat Jun 21 13:19:02 2014
@@ -25,6 +25,7 @@ import java.util.zip.ZipFile;
 
 import org.apache.lucene.analysis.hunspell.Dictionary;
 import org.apache.lucene.util.LuceneTestCase;
+import org.apache.lucene.util.LuceneTestCase.SuppressSysoutChecks;
 import org.apache.lucene.util.RamUsageTester;
 import org.junit.Ignore;
 
@@ -34,6 +35,7 @@ import org.junit.Ignore;
  * Note some of the files differ only in case. This may be a problem on your operating system!
  */
 @Ignore("enable manually")
+@SuppressSysoutChecks(bugUrl = "prints important memory utilization stats per dictionary")
 public class TestAllDictionaries extends LuceneTestCase {
   
   // set this to the location of where you downloaded all the files
@@ -180,7 +182,7 @@ public class TestAllDictionaries extends
   }
   
   public void testOneDictionary() throws Exception {
-    String toTest = "hu_HU.zip";
+    String toTest = "zu_ZA.zip";
     for (int i = 0; i < tests.length; i++) {
       if (tests[i].equals(toTest)) {
         File f = new File(DICTIONARY_HOME, tests[i]);

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries2.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries2.java?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries2.java (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries2.java Sat Jun 21 13:19:02 2014
@@ -26,6 +26,7 @@ import java.util.zip.ZipFile;
 import org.apache.lucene.analysis.hunspell.Dictionary;
 import org.apache.lucene.util.LuceneTestCase;
 import org.apache.lucene.util.RamUsageTester;
+import org.apache.lucene.util.LuceneTestCase.SuppressSysoutChecks;
 import org.junit.Ignore;
 
 /**
@@ -34,6 +35,7 @@ import org.junit.Ignore;
  * You must click and download every file: sorry!
  */
 @Ignore("enable manually")
+@SuppressSysoutChecks(bugUrl = "prints important memory utilization stats per dictionary")
 public class TestAllDictionaries2 extends LuceneTestCase {
   
   // set this to the location of where you downloaded all the files

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/circumfix.dic
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/circumfix.dic?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/circumfix.dic (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/circumfix.dic Sat Jun 21 13:19:02 2014
@@ -1,2 +1,2 @@
 1
-nagy/C    [MN]
+nagy/C	[MN]

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/conv.dic
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/conv.dic?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/conv.dic (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/conv.dic Sat Jun 21 13:19:02 2014
@@ -1,2 +1,2 @@
 1
-drink/X   [VERB]
+drink/X	[VERB]

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/dependencies.dic
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/dependencies.dic?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/dependencies.dic (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/dependencies.dic Sat Jun 21 13:19:02 2014
@@ -1,3 +1,3 @@
 2
-drink/RQ  [verb]
-drink/S   [noun]
+drink/RQ	[verb]
+drink/S	[noun]

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/homonyms.dic
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/homonyms.dic?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/homonyms.dic (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/homonyms.dic Sat Jun 21 13:19:02 2014
@@ -1,3 +1,3 @@
 2
-work/A    [VERB]
-work/B    [NOUN]
\ No newline at end of file
+work/A	[VERB]
+work/B	[NOUN]
\ No newline at end of file

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/ignore.dic
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/ignore.dic?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/ignore.dic (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/ignore.dic Sat Jun 21 13:19:02 2014
@@ -1,3 +1,3 @@
 1
-drink/X   [VERB]
-dr-ank/X  [VERB]
\ No newline at end of file
+drink/X	[VERB]
+dr-ank/X	[VERB]
\ No newline at end of file

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morph.dic
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morph.dic?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morph.dic (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/morph.dic Sat Jun 21 13:19:02 2014
@@ -1,2 +1,2 @@
 1
-drink/X   [VERB]
+drink/X	[VERB]

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/twosuffixes.dic
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/twosuffixes.dic?rev=1604355&r1=1604354&r2=1604355&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/twosuffixes.dic (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/twosuffixes.dic Sat Jun 21 13:19:02 2014
@@ -1,2 +1,2 @@
 1
-drink/X   [VERB]
+drink/X	[VERB]


