Return-Path: Delivered-To: apmail-jakarta-lucene-user-archive@apache.org Received: (qmail 35228 invoked from network); 20 Nov 2001 22:15:01 -0000 Received: from unknown (HELO nagoya.betaversion.org) (192.18.49.131) by daedalus.apache.org with SMTP; 20 Nov 2001 22:15:01 -0000 Received: (qmail 311 invoked by uid 97); 20 Nov 2001 22:15:04 -0000 Delivered-To: qmlist-jakarta-archive-lucene-user@jakarta.apache.org Received: (qmail 295 invoked by uid 97); 20 Nov 2001 22:15:03 -0000 Mailing-List: contact lucene-user-help@jakarta.apache.org; run by ezmlm Precedence: bulk List-Unsubscribe: List-Subscribe: List-Help: List-Post: List-Id: "Lucene Users List" Reply-To: "Lucene Users List" Delivered-To: mailing list lucene-user@jakarta.apache.org Received: (qmail 284 invoked from network); 20 Nov 2001 22:15:03 -0000 Message-ID: From: "New, Cecil (GEAE)" To: "'Lucene Users List'" Subject: RE: Attribute Search Date: Tue, 20 Nov 2001 17:18:32 -0500 X-Mailer: Internet Mail Service (5.5.2653.19) X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N Actually, "last name" is not a good example. Social security numbers, phone numbers, PO numbers, organization codes, etc. are better examples. These fields are not even text. So I did not think it made sense to tokenize them. But I did want them indexed and searchable. -----Original Message----- From: Emmanuel Bridonneau [mailto:EBridonneau@epicentric.com] Sent: Monday, November 19, 2001 10:02 PM To: 'Lucene Users List' Subject: RE: Attribute Search I am new here too but here's my 2 cents. If you don't tokenize your db textvalues, what do you say will be the resulting terms indexed? I think not what you expect. Your non tokenized fields probably are not filtered out hence a lastname like 'Smith' will not be a hit if the query is 'smith' the search being case sensitive. I last name is "smith B" (middle initial), search on 'smith' won't return either because not a token. I suggest you double check your values in your DB especially if DB is case sensitive. Does your analyser takes into account the accent if Latin type of locale? -----Original Message----- From: Cecil, Paula New [mailto:cnew@fuse.net] Sent: Monday, November 19, 2001 9:47 PM To: LUCENE Text Search Subject: Attribute Search I am trying index a set of data, storing only a "primary key". This primary key I left un-indexed. There is one "text" field, that I indexed and tokenized. The others I neither want to store or tokenized. My reasoning was that "not tokenizing" would produce the smallest index. The remaining fields were lastname, firstname, etc. However, my queries did not work correctly; never returning any hits. I finally gave up and re-indexed with Tokenize set to true on all the fields. Now my queries work. And to my surprise, the index was smaller that when I did not tokenize. I found this a little counter-intuitive. Can someone explain this? -- To unsubscribe, e-mail: For additional commands, e-mail: -- To unsubscribe, e-mail: For additional commands, e-mail: