On Mar 31, 2005, at 10:49 PM, pashupathinath wrote: > i should do even more analysis as suggested by you > before i should come to a decision of which analyser i > should be using to solve this. what about writing a > custom analyzer to solve this ??? how can i go abt the > logic of implementing this in a custom analyzer.. > where this returns all the documents that has even a > part of the search string. > any insight into this would be very helpful > especially in terms of performance wise. This is an involved topic, and one that is covered in great detail in the analysis chapter of Lucene in Action (shameless plug, yes, I know!). I recommend you analyze the types of queries that need to be made and what type of user interface you will present for this - then determine what makes the most sense analysis-wise. WhitespaceAnalyzer is not going to be good enough, as I suspect you'll want case-insensitive searches at least. Erik > > thanks, > pashupathinath.k > > --- Erik Hatcher wrote: >> >> On Mar 31, 2005, at 11:44 AM, pashupathinath wrote: >> >>> is it possible to index using a predefined >> analyzer >>> and search using a custom analyzer ?? >> >> Yes, its perfectly fine to do so with the caveat >> that you end up >> searching for the terms exactly as they were >> indexed. >> >> I end up doing this in most applications, actually, >> primarily because >> untokenized fields need to use the KeywordAnalyzer >> during searching. >> >>> i'm searching using the built in whitespace >>> analyser. the problem is when i'm searching for a >> part >>> of a string the search results are zero. >>> i'm using white space analyzer. for example if >> the >>> statement is "my name is abc123" the search for >> abc or >>> 123 doesnt return any hits. >>> anyinsight into this ?? >> >> The exact terms indexed using WhitespaceAnalyzer are >> like this (using >> the Lucene in Action AnalyzerDemo - "ant >> AnalyzerDemo"): >> >> [input] String to analyze: [This string will be >> analyzed.] >> my name is abc123 >> [echo] Running lia.analysis.AnalyzerDemo... >> [java] Analyzing "my name is abc123" >> [java] WhitespaceAnalyzer: >> [java] [my] [name] [is] [abc123] >> >> [java] SimpleAnalyzer: >> [java] [my] [name] [is] [abc] >> >> [java] StopAnalyzer: >> [java] [my] [name] [abc] >> >> [java] StandardAnalyzer: >> [java] [my] [name] [abc123] >> >> So you indexed "abc123" and searches must search for >> that term >> *exactly*. You can search for "abc*" as a >> PrefixQuery or WildcardQuery >> and find "abc123". "*123" will also find it though >> QueryParser does >> not support leading wildcard characters (but the API >> does). Wildcard >> queries are not ideally what you want as it tends to >> be much slower for >> large indexes. >> >> You may need to do specialized analysis. Perhaps >> you could share you >> real needs with the list and we could offer >> recommendations. It is >> possible to index "abc123", "abc", and "123" all >> within the same >> position in the index if you do some clever analysis >> and that meshes >> with what you're after. >> >> Erik >> >> >> > --------------------------------------------------------------------- >> To unsubscribe, e-mail: >> java-user-unsubscribe@lucene.apache.org >> For additional commands, e-mail: >> java-user-help@lucene.apache.org >> >> > > Send instant messages to your online friends > http://uk.messenger.yahoo.com > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org > For additional commands, e-mail: java-user-help@lucene.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org For additional commands, e-mail: java-user-help@lucene.apache.org