lucene-java-user mailing list archives

From "Terry Steichen" <te...@net-frame.com>
Subject Re: Tags Screwing up Searches
Date Sat, 19 Oct 2002 03:56:35 GMT
I tested against the phrase in my text, '<b>men's college soccer</b>',
matching successfully on 'college AND soccer*'.  However, I found no match
for 'college AND soccer', 'college AND soccer<*', 'college AND soccer<',
'college AND soccerb', 'college AND soccerb*', or 'college AND soccer/'.
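
For reference, here is a minimal sketch of the first option raised in the original question: filtering out the tags before the text goes into the Document. It assumes the markup is limited to simple tags such as <p>, <b>, and <i>, so a regular expression is adequate. `TagStripper` and `stripTags` are hypothetical names used for illustration only; they are not part of Lucene.

```java
import java.util.regex.Pattern;

public class TagStripper {
    // Assumption: tags are simple, e.g. <b>, </b>, <p class="x">.
    // This is not a full HTML parser and will not handle comments or scripts.
    private static final Pattern TAG = Pattern.compile("</?[A-Za-z][^>]*>");

    /**
     * Replaces each tag with a space, so that words on either side of a tag
     * are not glued together, then collapses runs of whitespace.
     */
    public static String stripTags(String text) {
        String noTags = TAG.matcher(text).replaceAll(" ");
        return noTags.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // The phrase from the test above, cleaned before it is analyzed.
        System.out.println(stripTags("<b>men's college soccer</b>"));
    }
}
```

The cleaned string would then be passed to the indexing code in place of the raw text, so the analyzer never sees the angle brackets.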

Regards,

Terry

---- Original Message -----
From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Friday, October 18, 2002 9:32 PM
Subject: Re: Tags Screwing up Searches


> Is it possible that the Analyzer is stripping <, >, and / characters
> and leaving you with terms like: bCollege and Soccerb ?
>
> Otis
>
> --- Terry Steichen <terry@net-frame.com> wrote:
> > Some content I'm indexing contains certain HTML tags, like <p>, <b>,
> > <i>, etc.  What I find is that when a term I'm searching for touches
> > one of these tags (which is fairly typical), the term isn't
> > recognized and the search fails.  For example, <b>College Soccer</b>
> > doesn't match on either "college" or "soccer".  I seem to recall
> > someone else bringing up a similar problem with a word that ends a
> > sentence (and is thus treated as if the period were part of the word),
> > but I don't recall what the response was and I can't find that thread.
> >
> > Does anyone have some ideas on what's the best way to handle this?
> > Filter out the tags in the process of creating the Document for
> > indexing? Or through a modification to the Analyzer (I'm using the
> > StandardAnalyzer)? Or something else?
> >
> > TIA,
> >
> > Terry
> >
> >
>
>
> --
> To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
>
>



