Date: Fri, 21 Dec 2012 15:27:20 +0800
From: Xi Shen
To: general@lucene.apache.org
Subject: Fwd: Which token filter can combine 2 terms into 1?

Hi,

I am looking for a token filter that can combine 2 terms into 1. For example, suppose the input has been tokenized on whitespace:

t1 t2 t2a t3

I want a filter that outputs:

t1 t2t2a t3

I know it is a very special case, and I am thinking about developing a filter of my own. But I cannot figure out which API I should use to look for terms in a token stream.

--
Regards,
David Shen
http://about.me/davidshen
https://twitter.com/#!/davidshen84
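
The transformation described above (t1 t2 t2a t3 becoming t1 t2t2a t3) comes down to reading one token ahead and deciding whether to glue it onto the current one. Below is a minimal sketch of that look-ahead logic in plain Java over an Iterator of strings. It is not Lucene code: the class name PairMerger, the merge() helper, and the shouldMerge predicate are all hypothetical, and the rule used in main() (merge when the following token starts with the current one) is only an assumption, since the message does not say how the pair is chosen. In Lucene the same buffering would presumably sit inside a TokenFilter subclass's incrementToken(), with the term text read through CharTermAttribute; that mapping is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiPredicate;

/** Hypothetical one-token look-ahead merger; plain Java, not the Lucene API. */
public class PairMerger implements Iterator<String> {
    private final Iterator<String> input;
    // Assumed rule deciding whether two adjacent tokens are combined.
    private final BiPredicate<String, String> shouldMerge;
    // Token that was read ahead but not merged; emitted on the next call.
    private String pending;

    public PairMerger(Iterator<String> input, BiPredicate<String, String> shouldMerge) {
        this.input = input;
        this.shouldMerge = shouldMerge;
    }

    @Override
    public boolean hasNext() {
        return pending != null || input.hasNext();
    }

    @Override
    public String next() {
        String current;
        if (pending != null) {
            current = pending;
            pending = null;
        } else if (input.hasNext()) {
            current = input.next();
        } else {
            throw new NoSuchElementException();
        }
        // Peek at the following token, if there is one.
        if (input.hasNext()) {
            String following = input.next();
            if (shouldMerge.test(current, following)) {
                return current + following; // combine the two terms into one
            }
            pending = following; // keep it for the next call
        }
        return current;
    }

    /** Convenience helper: run the merger over a list of non-null tokens. */
    public static List<String> merge(List<String> tokens,
                                     BiPredicate<String, String> shouldMerge) {
        List<String> out = new ArrayList<>();
        PairMerger merger = new PairMerger(tokens.iterator(), shouldMerge);
        while (merger.hasNext()) {
            out.add(merger.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed rule: merge when the following token starts with the current one.
        List<String> out = merge(Arrays.asList("t1", "t2", "t2a", "t3"),
                                 (a, b) -> b.startsWith(a));
        System.out.println(out);
    }
}
```

A merged token is never merged again with the token after it, so each output term combines at most two input terms, which matches the "2 terms into 1" case in the question.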