lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: zero termfreq for some search strings with special characters
Date Wed, 20 Jun 2007 13:23:31 GMT
You don't. You don't have an actual term "emp-id" in your index. You
have "emp" and "id". So "emp-id" isn't a term.

If you really want to control this sort of thing, and none of the
stock analyzers work exactly as you require, you need to write
your own Analyzer that breaks the stream however you want, and
use *that* analyzer at index and search time. Then looking at
termfreq will work as you expect.
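
To make the point above concrete, here is a minimal plain-Java sketch. It
makes no Lucene calls, and the class TermFreqSketch and its methods are made
up for illustration. It counts terms for the sample value from this thread
under two hypothetical tokenizers: one that splits on anything that is not a
letter or digit (a rough stand-in for what the standard tokenizer does to
"emp-id", not its actual grammar), and one that splits on whitespace only
(a stand-in for a custom analyzer).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TermFreqSketch {
    // Lowercase the text, split it on the given delimiter regex, drop empty pieces.
    static List<String> split(String text, String delimiterRegex) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split(delimiterRegex)) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Assumption: rough stand-in for a standard-style tokenizer ("emp-id" -> "emp", "id").
    static List<String> splitOnPunctuation(String text) {
        return split(text, "[^a-z0-9]+");
    }

    // Assumption: stand-in for a custom analyzer that keeps "emp-id" as one term.
    static List<String> splitOnWhitespace(String text) {
        return split(text, "\\s+");
    }

    // Count how often each term occurs in the token list.
    static Map<String, Integer> termFreqs(List<String> tokens) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String t : tokens) {
            freqs.merge(t, 1, Integer::sum);
        }
        return freqs;
    }

    public static void main(String[] args) {
        String doc = "emp-id Aq234 kaith creating document for search";
        Map<String, Integer> standardLike = termFreqs(splitOnPunctuation(doc));
        Map<String, Integer> custom = termFreqs(splitOnWhitespace(doc));

        System.out.println(standardLike.getOrDefault("emp-id", 0)); // 0: no such term was indexed
        System.out.println(standardLike.getOrDefault("emp", 0));    // 1
        System.out.println(custom.getOrDefault("emp-id", 0));       // 1
    }
}
```

The raw lookup only succeeds when the string you ask for is exactly one of
the terms the tokenizer produced at index time.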

PerFieldAnalyzerWrapper will allow you to treat different fields
differently, which may help if you want one sort of behavior for
one field in your documents and different behavior for others.
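
As a rough illustration of the per-field idea, the plain-Java sketch below
picks a tokenizer by field name and falls back to a default for every other
field. It is only an analogy, not the Lucene PerFieldAnalyzerWrapper API: the
class PerFieldSketch, its methods, and the "TITLE" field are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

class PerFieldSketch {
    private final Function<String, List<String>> defaultTokenizer;
    private final Map<String, Function<String, List<String>>> byField = new HashMap<>();

    PerFieldSketch(Function<String, List<String>> defaultTokenizer) {
        this.defaultTokenizer = defaultTokenizer;
    }

    // Register a tokenizer that applies to one field only.
    void addTokenizer(String field, Function<String, List<String>> tokenizer) {
        byField.put(field, tokenizer);
    }

    // Use the field's own tokenizer if one is registered, else the default.
    List<String> tokenize(String field, String text) {
        return byField.getOrDefault(field, defaultTokenizer).apply(text);
    }

    // Lowercase the text, split it on the given delimiter regex, drop empty pieces.
    static List<String> split(String text, String delimiterRegex) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split(delimiterRegex)) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Default: split on anything that is not a letter or digit.
        PerFieldSketch analysis = new PerFieldSketch(text -> split(text, "[^a-z0-9]+"));
        // CONTENT only: split on whitespace, so "emp-id" stays one term.
        analysis.addTokenizer("CONTENT", text -> split(text, "\\s+"));

        System.out.println(analysis.tokenize("CONTENT", "emp-id Aq234")); // [emp-id, aq234]
        System.out.println(analysis.tokenize("TITLE", "emp-id Aq234"));   // [emp, id, aq234]
    }
}
```

Whatever dispatch you use, the same per-field choice has to be in effect at
both index and search time, or the terms will not line up.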

Best
Erick

On 6/20/07, SK R <rsk.sen@gmail.com> wrote:
>
> Hi,
>      Thanks for your reply.
>       But how do I get the termfreq of that term ("emp-id")? Does Lucene
> have any other way to handle this?
>       I would appreciate any solution to this problem.
>
> Regards
> SenthilKumaran
>
>
> On 6/20/07, Liu_Andy2@emc.com <Liu_Andy2@emc.com> wrote:
> >
> > You are right!
> > "emp-id" will be split into two terms, CONTENT:"emp" and CONTENT:"id",
> > by the standard tokenizer at both indexing and search time. But a term
> > written directly (CONTENT:"emp-id") is not analyzed, so it is not split.
> >
> > Andy
> >
> > -----Original Message-----
> > From: SK R [mailto:rsk.sen@gmail.com]
> > Sent: Wednesday, June 20, 2007 5:24 PM
> > To: java-user@lucene.apache.org
> > Subject: zero termfreq for some search strings with special characters
> >
> > Hi,
> >     I'm using the standard tokenizer for both indexing and searching.
> > My indexed value is like "emp-id Aq234 kaith creating document for
> > search".
> >     I can get search results for the query CONTENT:"emp-id" by using
> > hits = indexSearcher.search(query).
> >     But if I try to get the term frequency of that term
> > (CONTENT:"emp-id") by using
> > indexReader.termDocs(new Term("CONTENT","emp-id")).freq(), 0 is
> > returned.
> >     I think I get a result in the 1st case because the query goes
> > through the analyzer, but in the 2nd case (term freq) no analyzer is
> > applied, so I get nothing. Is that right?
> >     How do I get the correct term frequency for that term?
> >
> >
> > Thanks & Regards
> > RSK
> >
>
