From: "Burton-West, Tom"
To: "java-user@lucene.apache.org"
Date: Fri, 4 Feb 2011 15:34:21 -0500
Subject: RE: Bigrams for CJK with ICUTokenizer ?
Message-ID: <47316FE3F6BA0D4DADF99663512552680AF3BC0611@ITCS-ECLS-1-VS3.adsroot.itcs.umich.edu>

Thanks Robert,

I opened LUCENE-2906, but I just realized that, in the effort to keep the description short, I forgot to include your option of producing both unigrams and bigrams, which is a nice option.

Tom


-----Original Message-----
From: Robert Muir [mailto:rcmuir@gmail.com]
Sent: Friday, February 04, 2011 3:19 PM
To: java-user@lucene.apache.org
Subject: Re: Bigrams for CJK with ICUTokenizer ?

On Fri, Feb 4, 2011 at 3:07 PM, Burton-West, Tom <tburtonw@umich.edu> wrote:
> Thanks Robert,
>
> LUCENE-2740 looks really interesting. In the meantime a JIRA issue for this sounds like a good idea, since I'm guessing other people would like to use the ICUTokenizer but would also like bigrams for CJK.
>
> I'm a bit confused over the relationship of the queryparser to the filter chain, and whether a filter in the chain after the ICUTokenizer could construct bigrams if the ICUTokenizer is spitting out unigrams and the queryparser is then converting the unigrams to Boolean clauses (i.e. autoGeneratePhraseQueries=false).

The QP only sees two things:
1. the input string, which it parses before the analyzer
2. the result of the entire analyzer (tokenizer and all filters).

So in this case, only #2 would be different, as the entire analyzer
would output AB, BC instead of A, B, C.
With your settings, for an input of ABC, you will get a regular
boolean query with AB, BC.
If the user puts "ABC" in quotes though, you will get a phrase query of "AB BC".

>
> If ABC is a string of Han characters and the ICUTokenizer spits out unigrams A B C (and we have autoGeneratePhraseQueries set to false), won't the next filter in the chain get each of the unigrams in a Boolean clause one at a time? I guess I don't see how the next filter in the chain can reassemble the unigrams into overlapping bigrams. Maybe I'm not understanding how tokens get passed from one filter to the next when one of the filters (or in this case the tokenizer) breaks a token up into multiple tokens.

In this case it works just like a selective ShingleFilter?

>
> Or am I getting index time analysis confused with query time analysis?
> Did you mean that ICUTokenizer could be modified to output bigrams, or that a filter could be designed that would take the output of the ICUTokenizer and create shingles on tokens with the attribute for Han?
>

I think the latter. This way, we can provide the most options: unigram
(what it does by default: A,B,C), but also filters for bigram (AB BC),
or unibigram (A, AB, B, BC, C).
This is why I said we can make these filters experimental for now,
because ideally at some point you will be able to use ShingleFilter
"conditionally" over the ScriptAttribute for these use-cases, without
having to have a special filter.
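
To make the three output options and the query parser behaviour discussed above concrete, here is a minimal sketch in plain Python. It is not Lucene code and not the filter proposed in the JIRA issue: the names `han_shingles`, `toy_analyzer`, and `build_query` are hypothetical, a token is modelled as a `(text, script)` pair standing in for the ScriptAttribute, and the choice to emit a lone Han character unchanged is an assumption. It only illustrates that the shingling happens on the analyzer's token stream, and that the query parser then sees nothing but the final terms.

```python
from typing import Callable, List, Tuple

# A token is (text, script). The script tag is a stand-in for the
# ScriptAttribute mentioned in the thread; this is not Lucene's API.
Token = Tuple[str, str]


def _shingle_run(run: List[str], mode: str) -> List[str]:
    """Turn one run of adjacent Han unigrams into the requested form."""
    if mode == "unigram" or len(run) == 1:
        # Assumption: a lone Han character is passed through unchanged.
        return list(run)
    out: List[str] = []
    for k, ch in enumerate(run):
        if mode == "unibigram":
            out.append(ch)                # keep the unigram as well
        if k + 1 < len(run):
            out.append(ch + run[k + 1])   # overlapping bigram
    return out


def han_shingles(tokens: List[Token], mode: str = "bigram") -> List[str]:
    """Shingle only runs of Han tokens; pass every other token through."""
    if mode not in ("unigram", "bigram", "unibigram"):
        raise ValueError("unknown mode: " + mode)
    out: List[str] = []
    i = 0
    while i < len(tokens):
        text, script = tokens[i]
        if script != "Han":
            out.append(text)
            i += 1
            continue
        j = i
        while j < len(tokens) and tokens[j][1] == "Han":
            j += 1                        # extend the run of Han tokens
        out.extend(_shingle_run([t for t, _ in tokens[i:j]], mode))
        i = j
    return out


def toy_analyzer(text: str) -> List[str]:
    """Toy analyzer: treats every non-space character as a Han unigram
    (as with the placeholders A, B, C in the thread), then bigrams them."""
    unigrams = [(ch, "Han") for ch in text if not ch.isspace()]
    return han_shingles(unigrams, "bigram")


def build_query(raw: str, analyze: Callable[[str], List[str]]):
    """Toy query parser: it looks at the raw input (quoted or not) and at
    the analyzer's final output, never at intermediate tokens."""
    quoted = len(raw) >= 2 and raw.startswith('"') and raw.endswith('"')
    terms = analyze(raw[1:-1] if quoted else raw)
    return ("phrase" if quoted else "boolean", terms)


if __name__ == "__main__":
    abc = [("A", "Han"), ("B", "Han"), ("C", "Han")]
    print(han_shingles(abc, "unigram"))        # ['A', 'B', 'C']
    print(han_shingles(abc, "bigram"))         # ['AB', 'BC']
    print(han_shingles(abc, "unibigram"))      # ['A', 'AB', 'B', 'BC', 'C']
    print(build_query('ABC', toy_analyzer))    # ('boolean', ['AB', 'BC'])
    print(build_query('"ABC"', toy_analyzer))  # ('phrase', ['AB', 'BC'])
```

Because the shingling step consumes the whole token stream rather than one Boolean clause at a time, the unquoted input ABC yields a boolean query over AB and BC, while the quoted input yields a phrase over the same two terms.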