From: Mike Klaas
Subject: Re: thoughts/suggestions for analyzing/tokenizing class names
Date: Mon, 17 Dec 2007 09:28:49 -0800
To: java-user@lucene.apache.org

On 15-Dec-07, at 3:14 PM, Beyer,Nathan wrote:

> I have a few fields that use package names and class names and I've been
> looking for some suggestions for analyzing these fields.
>
> A few examples -
>
> Text (class name)
> - "org.apache.lucene.document.Document"
> Queries that would match
> - "org.apache" , "org.apache.lucene.document"
>
> Text (class name + method signature)
> -- "org.apache.lucene.document.Document#add(Fieldable)"
> Queries that would match
> -- "org.apache.lucene", "org.apache.lucene.document.Document#add"
>
> Any thoughts on how to approach tokenizing these types of texts?

Perhaps it would help to include some examples of queries you _don't_
want to match. For all the examples above, simply tokenizing
alphanumeric components would suffice.
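A rough sketch of what "tokenizing alphanumeric components" could look like,
in plain Java with no Lucene classes involved. The class ClassNameTokens and
its two methods are hypothetical names made up for illustration, and the
lowercasing step is an assumption, not something the examples require. The
phrase check only approximates what a phrase query over these tokens would do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: it only illustrates the tokenization idea.
final class ClassNameTokens {

    // Split on every run of non-alphanumeric characters ('.', '#', '(', ')', ...)
    // and lowercase each piece (lowercasing is an assumption).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^A-Za-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // True if the query's tokens appear as one contiguous run inside the text's tokens.
    static boolean phraseMatches(String text, String query) {
        List<String> queryTokens = tokenize(query);
        return !queryTokens.isEmpty()
                && Collections.indexOfSubList(tokenize(text), queryTokens) >= 0;
    }

    public static void main(String[] args) {
        String className = "org.apache.lucene.document.Document";
        String method = "org.apache.lucene.document.Document#add(Fieldable)";

        // prints [org, apache, lucene, document, document, add, fieldable]
        System.out.println(tokenize(method));
        // prints true
        System.out.println(phraseMatches(className, "org.apache"));
        // prints true
        System.out.println(phraseMatches(method, "org.apache.lucene.document.Document#add"));
        // prints false: "org" and "lucene" are not adjacent tokens
        System.out.println(phraseMatches(className, "org.lucene"));
    }
}
```

Note that under this scheme a query such as "apache.lucene" matches too, since
nothing anchors the phrase to the start of the field. Whether that is
acceptable is exactly the kind of thing the non-matching examples would settle.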
-Mike