lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel Bridonneau <EBridonn...@epicentric.com>
Subject RE: Attribute Search
Date Wed, 21 Nov 2001 00:12:07 GMT
Did you check w/FAQ 26 on searching?
http://www.lucene.com/cgi-bin/faq/faqmanager.cgi?file=chapter.search&toc=faq
#q26


-----Original Message-----
From: New, Cecil (GEAE) [mailto:cecil.new@ae.ge.com]
Sent: Tuesday, November 20, 2001 2:19 PM
To: 'Lucene Users List'
Subject: RE: Attribute Search


Actually, "last name" is not a good example.  Social security numbers, phone
numbers, PO numbers, organization codes, etc. are better examples.

These fields are not even text.  So I did not think it made sense to
tokenize them.  But I did want them indexed and searchable.

-----Original Message-----
From: Emmanuel Bridonneau [mailto:EBridonneau@epicentric.com]
Sent: Monday, November 19, 2001 10:02 PM
To: 'Lucene Users List'
Subject: RE: Attribute Search


I am new here too but here's my 2 cents.
If you don't tokenize your db textvalues, what do you say will be the
resulting terms indexed? I think not what you expect.
Your non tokenized fields probably are not filtered out hence a lastname
like 'Smith' will not be a hit if the query is 'smith' the search being case
sensitive.
I last name is "smith B" (middle initial), search on 'smith' won't return
either because not a token.
I suggest you double check your values in your DB especially if DB is case
sensitive.
Does your analyser takes into account the accent if Latin type of locale?


-----Original Message-----
From: Cecil, Paula New [mailto:cnew@fuse.net]
Sent: Monday, November 19, 2001 9:47 PM
To: LUCENE Text Search
Subject: Attribute Search


I am trying index a set of data, storing only a "primary key".  This primary
key I left un-indexed.  There is one "text" field, that I indexed and
tokenized.

The others I neither want to store or tokenized.  My reasoning was that "not
tokenizing" would produce the smallest index.  The remaining fields were
lastname, firstname, etc.

However, my queries did not work correctly; never returning any hits.

I finally gave up and re-indexed with Tokenize set to true on all the
fields.

Now my queries work.  And to my surprise, the index was smaller that when I
did not tokenize.

I found this a little counter-intuitive.  

Can someone explain this?

--
To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>

--
To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>

--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message