lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gusenbauer Stefan <gusenba...@eduhi.at>
Subject Re: Confused about non-tokenized fields
Date Fri, 27 May 2005 16:14:33 GMT
Max Pfingsthorn wrote:

>Hi!
>
>Thanks for the reply. I figured already that fields are actually not tokenized... I lost
track of the filenames/dirnames and there were some duplicates...
>
>About case-insensitivity: Okay, I can make my query lower case, but my strings in the
field are not... I guess I have to do that manually during indexing? Or is there some nicer
way?
>  
>
I think this is not a problem. This should be done automatically when
you make a case insensitiv search so that you don't have to think about
it. If it should become a problem write another email *g*
Stefan

>Thanks!
>Max Pfingsthorn
>
>-----Original Message-----
>From: Gusenbauer Stefan [mailto:gusenbauer@eduhi.at]
>Sent: Friday, May 27, 2005 18:00
>To: java-user@lucene.apache.org
>Subject: Re: Confused about non-tokenized fields
>
>
>Max Pfingsthorn wrote:
>
>  
>
>>Hi!
>>
>>In my application, I index some strings (like filenames) untokenized, meaning via
>>
>>doc.add(new Field(FIELD,VALUE,false,true,false));
>>
>>When I later take a look at it with Luke, I still get tokens of the filenames (like
"news" instead of "news-item.xml") in the list of most frequent terms. Shouldn't I get only
the complete filenames there??
>>
>>Also, how do I search case-insensitive over this kind of field?
>>
>>Thanks!
>>
>>Best regards,
>>
>>Max Pfingsthorn
>>
>>Hippo  
>>
>>Oosteinde 11
>>1017WT Amsterdam
>>The Netherlands
>>Tel  +31 (0)20 5224466
>>-------------------------------------------------------------
>>m.pfingsthorn@hippo.nl / www.hippo.nl
>>--------------------------------------------------------------
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>> 
>>
>>    
>>
>For indexing untokenized fields try the static method
>Field.Keyword(String fieldname,String value) then the string is really
>not tokenized. But i think new Field with your params should make the
>same. Have you tried to make a search for the filename this should only
>return a result when you write out the whole filename.
>
>Case insensitive search is standard when you use the standardanalyzer i
>think:
>the code should look like this
>Searcher.search(QueryParser.parse("the query string","the fieldname",new
>StandardAnalyzer());
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
>  
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message