From: Markus Jelsma
Reply-To: markus@buyways.nl
Organization: Buyways
To: user@couchdb.apache.org
Subject: Re: couchdb-lucene : stemming analyzer configuration question
Date: Wed, 17 Feb 2010 16:29:07 +0100

Hi,

The documentation does not make clear which tokenizer each analyzer uses. The README states that the StandardAnalyzer only uses the LowerCaseFilterFactory and the StopFilterFactory, but it mentions no tokenizer. I assume it uses a simple WhitespaceTokenizer, which does not work with grams (substrings).

You can either query using a wildcard, which is supported AFAIK, or attempt to specify your own tokenizer, perhaps by creating a custom analyzer. Either way, searching for grams can be done using an NGramTokenizer.

Cheers,

> Hello,
> I am trying to support fulltext search with CouchDB-Lucene.
>
> I am using CouchDB 0.10.0 and couchdb-lucene 0.4 on Windows XP.
>
> I am able to query a word, but not able to match a partial word. For example,
> I have a 'name' field with a value 'alex'. I can query the documents if I
> use 'q=alex'. But I am not able to get any documents if I use 'q=a'.
>
> I suspect that this is because the default StandardAnalyzer does not support
> this. How should I configure a different analyzer to support this?
>
> rgds,
> canal

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
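
To illustrate the n-gram suggestion in the reply above, here is a minimal sketch in plain Python of why 'q=a' finds nothing against a whole-word index but does match once the field is indexed as grams. It does not use Lucene or couchdb-lucene; the function names (whitespace_tokens, ngram_tokens, matches) and the gram sizes 1 to 2 are assumptions made up for this example, and a real NGramTokenizer may differ in detail.

```python
def whitespace_tokens(text):
    """Lowercase and split on whitespace: one token per whole word."""
    return text.lower().split()


def ngram_tokens(text, min_gram=1, max_gram=2):
    """Emit every substring of each word with length min_gram..max_gram.

    Hypothetical stand-in for an n-gram tokenizer; the sizes are
    illustrative, not taken from the couchdb-lucene configuration.
    """
    grams = []
    for word in text.lower().split():
        for size in range(min_gram, max_gram + 1):
            for start in range(len(word) - size + 1):
                grams.append(word[start:start + size])
    return grams


def matches(query, indexed_tokens):
    """Exact term lookup: the query must equal one of the indexed tokens."""
    return query.lower() in indexed_tokens


name = "alex"
print(matches("a", whitespace_tokens(name)))  # False: only "alex" is indexed
print(matches("a", ngram_tokens(name)))       # True: "a" is one of the grams
```

The trade-off is index size: every word is stored as many short terms, so a wildcard query (a trailing wildcard is written a* in Lucene's query syntax) may be the cheaper option when only prefix matches are needed.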