From: "John Powers"
Date: Mon, 17 Apr 2006 11:59:15 -0500
Subject: hypens

Hello,

If a user searches for "b-trunk", I would like them to find "b-trunk" (with the hyphen).
I would also like someone searching for "b trunk" to find "b-trunk".

On the other hand, if someone searches for 12412, I would like them to find 12412-235, 12412-121, 12412-etc. Someone who types in 12412-235 directly should also get a good result list: just the one item would be best, but a larger list with that item on top is fine too.

For now I am using the StandardAnalyzer. I search all fields for what the user gives me in quotes, and also for the same thing without quotes. When I print out the final query, the quoted half of the overall query seems to have the hyphens stripped out, but the unquoted half does not, so this lets me find what I want. But each search phrase now appears in the final query twice. It seems to work fine, but it is pretty inelegant.

It seems like I can neither just strip out the hyphens nor keep them. I am storing the name as keyword, but everything else as Text. I thought that would matter, but a description, keyword, or other field may contain something like "this also relates to 23523-235", so if someone searched for 23523 I would want this in the list, and if they searched for 23523-235 I would still want it. So I don't know if it's solvable by the type of field I use to index it. Or do I have to store each field twice with different analyzers? That seems just as clumsy as my double-search solution.

Any thoughts?
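
For reference, the double-search approach described above might be sketched roughly as follows. This is only an illustration using plain string building, not the Lucene API: the class name `HyphenQuerySketch`, the method `buildQuery`, and the field names `name` and `description` are all hypothetical, and it assumes the resulting string is handed to a query parser whose default operator is OR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of Lucene: it only assembles a query string.
class HyphenQuerySketch {

    // Escape backslashes and double quotes so the input can sit inside a quoted phrase.
    static String escapeForPhrase(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // For every field, emit the user's input twice: once as a quoted phrase
    // and once unquoted (grouped so multi-word input stays within the field).
    // Assumption: the unquoted form is passed through without escaping.
    static String buildQuery(String userInput, List<String> fields) {
        String term = userInput.trim();
        StringJoiner clauses = new StringJoiner(" ");
        for (String field : fields) {
            clauses.add(field + ":\"" + escapeForPhrase(term) + "\"");
            clauses.add(field + ":(" + term + ")");
        }
        return clauses.toString();
    }

    public static void main(String[] args) {
        // Field names here are made up for the example.
        System.out.println(buildQuery("b-trunk", Arrays.asList("name", "description")));
    }
}
```

How each half is tokenized (and whether the hyphen survives) depends on the analyzer the query parser is given, so the sketch only shows how the query text is duplicated, not what it matches.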