lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gauri Shankar" <gshankar.s...@gmail.com>
Subject Re: Exact match query on a field in index which has been indexed using StandardAnalyzer
Date Wed, 14 May 2008 05:14:43 GMT
Thanks Erick...

My index size is  ~2 GB. Is it a good idea to keep another duplicate field
as UN_TOKENIZED and search using KeywordAnalyzer?

Few points:
1. when I say exact match then I mean the exact phrase match only. That
implies the query should not match a document with the field value "Member
of Technical Staff for Accounting".

2. stopwords are fine. So "Member Technical Staff" would be equivalent to
  my query, and is acceptable.

Thanks,
Gauri

On Tue, May 13, 2008 at 7:31 PM, Erick Erickson <erickerickson@gmail.com>
wrote:

> First, why do you OR together the different cases? Assuming
> you're pushing your query through StandardAnalyzer, it'll lowercase
> for you (just as it did during indexing).
>
> But to your question. Would you expect your query to match a document
> with the field value "Member of Technical Staff for Accounting"? because
> your phrase would match this as well.......
>
> Not to mention stop words. "Member Technical Staff" would be equivalent to
> your query, is that acceptable?
>
> Probably the simplest way would be to index the field UN_TOKENIZED too
> (perhaps normalizing it by, say, lower casing it and taking out stop
> words)
> and search against the duplicate field with KeywordAnalyzer. Yes it'll
> cost
> you some space but disk space is cheap <G>. How big is your index anyway?
>
> I suppose you could also STORE the field and take any documents returned,
> get the field and compare, but then stop words, casing, etc are tricky.
>
> What is the use case you're trying to support? There may be other ways
> to accomplish your goal than messing around with exact matches.
>
> On Tue, May 13, 2008 at 9:43 AM, Gauri Shankar <gshankar.sahu@gmail.com>
> wrote:
>
> > Hi,
> >
> > I have a field in index which has been indexed using StandardAnalyzer
> and
> > as
> > TOKENIZED. Now I would like to write a query which returns the hit if
> > there
> > is a exact match on the field value.
> >
> > Say, if field value is : Member of Technical Staff
> > then "member of technical staff" OR "Member of Technical Staff" should
> > returns this document. Note that I am writing the phrase query for
> > matching
> > the query string.
> >
> >
> >
> > --
> > Regards,
> > Gauri Shankar
> >
>



-- 
Regards,
Gauri Shankar

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message