lucene-java-commits mailing list archives

From hoss...@apache.org
Subject svn commit: r810247 - in /lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart: hhmm/package.html package.html
Date Tue, 01 Sep 2009 21:31:18 GMT
Author: hossman
Date: Tue Sep  1 21:31:18 2009
New Revision: 810247

URL: http://svn.apache.org/viewvc?rev=810247&view=rev
Log:
LUCENE-1882: improved package level docs for smartcn

Modified:
    lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/package.html
    lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html

Modified: lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/package.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/package.html?rev=810247&r1=810246&r2=810247&view=diff
==============================================================================
--- lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/package.html (original)
+++ lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/package.html Tue Sep  1 21:31:18 2009
@@ -15,14 +15,16 @@
  See the License for the specific language governing permissions and
  limitations under the License.
 -->
-<html><head></head>
+<html><head>
+<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
+</head>
 <body>
 <div>
-SmartChineseAnalyzer Hidden Markov Model package
+SmartChineseAnalyzer Hidden Markov Model package.
 </div>
 <div>
 <font color="#FF0000">
-WARNING: The status of the analyzers/smartcn <b>analysis.cn</b> package is experimental. The APIs
+WARNING: The status of the analyzers/smartcn <b>analysis.cn.smart</b> package is experimental. The APIs
 and file formats introduced here might change in the future and will not be supported anymore
 in such a case.
 </font>

Modified: lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html?rev=810247&r1=810246&r2=810247&view=diff
==============================================================================
--- lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html (original)
+++ lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/package.html Tue Sep  1 21:31:18 2009
@@ -15,17 +15,36 @@
  See the License for the specific language governing permissions and
  limitations under the License.
 -->
-<html><head></head>
+<html>
+<head>
+<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
+</head>
 <body>
 <div>
-SmartChineseAnalyzer Tokenizers and TokenFilters
+Analyzer for Simplified Chinese, which indexes words.
 </div>
 <div>
 <font color="#FF0000">
-WARNING: The status of the analyzers/smartcn <b>analysis.cn</b> package is experimental. The APIs
+WARNING: The status of the analyzers/smartcn <b>analysis.cn.smart</b> package is experimental. The APIs
 and file formats introduced here might change in the future and will not be supported anymore
 in such a case.
 </font>
 </div>
+<div>
+Three analyzers are provided for Chinese, each of which treats Chinese text in a different way.
+<ul>
+	<li>ChineseAnalyzer (in the analyzers/cn package): Index unigrams (individual Chinese characters) as a token.
+	<li>CJKAnalyzer (in the analyzers/cjk package): Index bigrams (overlapping groups of two adjacent Chinese characters) as tokens.
+	<li>SmartChineseAnalyzer (in this package): Index words (attempt to segment Chinese text into words) as tokens.
+</ul>
+
+Example phrase: "我是中国人"
+<ol>
+	<li>ChineseAnalyzer: 我-是-中-国-人</li>
+	<li>CJKAnalyzer: 我是-是中-中国-国人</li>
+	<li>SmartChineseAnalyzer: 我-是-中国-人</li>
+</ol>
+</div>
+
 </body>
 </html>
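
For readers of this commit, the unigram and bigram rows of the example phrase in the new package.html can be reproduced with a few lines of plain Java. This is only a sketch of the two tokenization schemes as described in the docs, not the code of ChineseAnalyzer or CJKAnalyzer; the class and method names (NGramSketch, unigrams, bigrams) are hypothetical and no Lucene API is used. The SmartChineseAnalyzer row is not reproduced, since word segmentation depends on the analyzer's Hidden Markov Model and its dictionary data.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: mimics the unigram and bigram schemes described
// in the package docs. Not the actual Lucene analyzer implementations.
public class NGramSketch {

    // One token per Unicode code point (the "unigram" scheme).
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            tokens.add(new String(Character.toChars(cp)));
            i += Character.charCount(cp);
        }
        return tokens;
    }

    // Overlapping pairs of adjacent characters (the "bigram" scheme).
    // Assumption: input shorter than two characters yields no tokens here.
    static List<String> bigrams(String text) {
        List<String> chars = unigrams(text);
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < chars.size(); i++) {
            tokens.add(chars.get(i) + chars.get(i + 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String phrase = "我是中国人";
        System.out.println(String.join("-", unigrams(phrase))); // 我-是-中-国-人
        System.out.println(String.join("-", bigrams(phrase)));  // 我是-是中-中国-国人
    }
}
```

The bigram scheme emits overlapping pairs, so a five-character phrase produces four tokens, and the word 中国 appears as one of them alongside pairs that are not words (是中, 国人).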


