lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "AnalyzersTokenizersTokenFilters" by ShalinMangar
Date Tue, 07 Jul 2009 18:42:43 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by ShalinMangar:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

The comment on the change is:
Added note on preserveOriginal and splitOnNumerics

------------------------------------------------------------------------------
   * '''splitOnCaseChange="1"''' causes lowercase => uppercase transitions to generate a new part [Solr 1.3]:
     * `"PowerShot" => "Power" "Shot"`
     * `"TransAM" => "Trans" "AM"`
+  * '''splitOnNumerics="1"''' causes alphabet => number transitions to generate a new part [Solr 1.3]:
+    * `"j2se" => "j" "2" "se"`
  Note that this is the default behaviour in all released versions of Solr.
  
  There are also a number of parameters that affect what tokens are present in the final output and if subwords are combined:
@@ -372, +374 @@

     * `"500-42" => "50042"`
   * '''catenateAll="1"''' causes all subword parts to be catenated:
     * `"wi-fi-4000" => "wifi4000"`
+  * '''preserveOriginal="1"''' causes the original token to be indexed without modifications (in addition to the tokens produced due to other options)
  
  These parameters may be combined in any way.
   * Example of generateWordParts="1" and  catenateWords="1":
@@ -391, +394 @@

                  catenateWords="0"
                  catenateNumbers="0"
                  catenateAll="0"
+                 preserveOriginal="1"
                  />
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StopFilterFactory"/>
@@ -404, +408 @@

                  catenateWords="1"
                  catenateNumbers="1"
                  catenateAll="0"
+                 preserveOriginal="1"
                  />
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StopFilterFactory"/>
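
The splitting rules quoted in the diff above can be illustrated with a small Python sketch. This is not the Solr implementation: the function names `split_subwords` and `analyze` are hypothetical, only ASCII letters and digits are handled, token positions and offsets are ignored, and the ordering of the output (original token first) is an assumption made for readability.

```python
import re


def split_subwords(token, split_on_case_change=True, split_on_numerics=True):
    """Split a token into subword parts (illustrative approximation only)."""
    parts = []
    # Break on intra-word delimiters first: anything that is not an ASCII letter or digit.
    for chunk in re.split(r"[^A-Za-z0-9]+", token):
        if not chunk:
            continue
        boundaries = [0]
        for i in range(1, len(chunk)):
            prev, cur = chunk[i - 1], chunk[i]
            # lowercase => uppercase transition, e.g. "r" -> "S" in "PowerShot"
            case_change = prev.islower() and cur.isupper()
            # letter <=> digit transition, e.g. "j" -> "2" and "2" -> "s" in "j2se"
            numeric_change = prev.isalpha() != cur.isalpha()
            if (split_on_case_change and case_change) or (
                split_on_numerics and numeric_change
            ):
                boundaries.append(i)
        boundaries.append(len(chunk))
        parts.extend(chunk[a:b] for a, b in zip(boundaries, boundaries[1:]))
    return parts


def analyze(token, catenate_all=False, preserve_original=False, **split_opts):
    """Combine splitting with simplified catenateAll / preserveOriginal handling."""
    parts = split_subwords(token, **split_opts)
    out = []
    if preserve_original:
        # Assumption: the unmodified token is emitted alongside the other tokens.
        out.append(token)
    if catenate_all:
        out.append("".join(parts))
    else:
        out.extend(parts)
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(out))


if __name__ == "__main__":
    print(split_subwords("PowerShot"))                   # ['Power', 'Shot']
    print(split_subwords("j2se"))                        # ['j', '2', 'se']
    print(analyze("wi-fi-4000", catenate_all=True))      # ['wifi4000']
    print(analyze("PowerShot", preserve_original=True))  # ['PowerShot', 'Power', 'Shot']
```

Passing `split_on_numerics=False` in this sketch leaves "j2se" as a single part, which roughly corresponds to setting splitOnNumerics="0" in the filter configuration.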
