lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: query syntax problem
Date Mon, 07 Aug 2006 18:50:43 GMT
When you say "we've tried the whitespace analyzer", did you mean for BOTH
indexing and searching? If you only use it for one of those, the indexed
terms and the query terms can be tokenized differently, and you'd see
results like this.

And do you use Luke? It'll let you examine your index and see what's
*actually* in it. It's the first place I go when I don't get the results I
expect.

See: http://www.getopt.org/luke/

What about capitalization? Lucene is case-sensitive. Some of the analyzers
automatically lower-case and some don't.
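
To make the capitalization point concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: the class and method names are made up for illustration, and it only assumes that a prefix query such as PFC_* is compared case-sensitively against whatever terms ended up in the index. One thing worth checking on the Lucene side: as far as I recall, QueryParser lower-cases wildcard and prefix terms by default (setLowercaseExpandedTerms), so PFC_* may be searched as pfc_*.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical simulation, NOT Lucene code: it mimics case-sensitive
// prefix matching against terms produced by two different tokenizers.
public class WildcardCaseSketch {

    // Split on whitespace only, keeping case and punctuation as-is.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Same split, then lower-case every token.
    static List<String> lowercasedTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : whitespaceTokens(text)) {
            tokens.add(t.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    // Case-sensitive prefix match, standing in for a query like "PFC_*".
    static boolean prefixMatches(List<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> asTyped = whitespaceTokens("PFC_0234 assay");
        List<String> lowered = lowercasedTokens("PFC_0234 assay");

        System.out.println(prefixMatches(asTyped, "PFC_")); // true
        System.out.println(prefixMatches(asTyped, "pfc_")); // false
        System.out.println(prefixMatches(lowered, "pfc_")); // true
        System.out.println(prefixMatches(lowered, "PFC_")); // false
    }
}
```

If the index holds PFC_0234 but the query side ends up with pfc_, the prefix never matches, which would look exactly like "wildcards don't work".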

If you're using the whitespace analyzer, I don't think you need to bother
transforming the hyphen into an underscore, since it only splits on
whitespace.
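
A rough illustration of why the hyphen survives, again in plain Java rather than Lucene. Both split rules below are stand-ins I made up: one splits on whitespace only, the other on any character that is not an ASCII letter or digit. Neither is meant to reproduce a specific Lucene analyzer exactly.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation, NOT Lucene code: compares two simple split rules.
public class HyphenTokenSketch {

    // Whitespace-only splitting leaves "PFC-0234" as a single token.
    static List<String> splitOnWhitespace(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Splitting on every non-alphanumeric character cuts the ID in two.
    static List<String> splitOnNonAlphanumerics(String text) {
        return Arrays.asList(text.trim().split("[^A-Za-z0-9]+"));
    }

    public static void main(String[] args) {
        System.out.println(splitOnWhitespace("compound PFC-0234"));       // [compound, PFC-0234]
        System.out.println(splitOnNonAlphanumerics("compound PFC-0234")); // [compound, PFC, 0234]
    }
}
```

Luke will show which of these two shapes your index actually contains.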

Hope this helps. Without more context I'm not sure what else to suggest.

Erick

On 8/7/06, Yiqun Eddie Cao <cao.yiqun@gmail.com> wrote:
>
> Hi,
>
> We are using lucene in a chemistry database, and we are dealing with
> special words containing both digits and characters in English alphabets,
> such as PFC-0234. To prevent lucene from cutting the word into two, we
> have replaced all dashes into underscores, so PFC-0234 is stored and
> indexed as PFC_0234 in the lucene index. However, none of them works for
> searches containing wildcard characters. For example, none of the
> following works: PFC_*, PFC*, PF*, PFC_0*, PFC_02*, but PFC-0234 works.
> Can anyone tell me what is wrong here? We have tried WhitespaceAnalyzer,
> but it's not working either.
>
> Thanks,
>
> Eddie
>
>
