lucene-general mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Wildcard searches????
Date Fri, 05 Feb 2010 19:42:52 GMT
Yes.  I think you have it.

To explain in a bit more detail, I think that you should store a tokenized
form of the user agents and should query using a tokenized form of your user
agent.  This will retrieve documents that partially match the user agent of
interest.  Many of these matches, however, may not meet the requirements of
the wildcard expressions stored in the documents.  As such, you will need to
look at each retrieved document in turn, fetch its wildcard expression, and
test whether the original (untokenized) query satisfies that wildcard.

If your wildcards are all of a positive nature as your example is, then this
should work pretty well.
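
The two steps above can be sketched in plain Python, without Lucene or SOLR.
This is only an illustration under stated assumptions: the helper names
(tokenize, wildcard_to_regex, build_index, search), the in-memory index, and
the sample documents doc2 and doc3 are all hypothetical, and "*" is assumed to
be the only wildcard character.  It also assumes the wildcard falls on a token
boundary; a pattern such as "Moz*" would yield the token "moz", which the
tokenized query "mozilla" would not match in step 1.

```python
import re

def tokenize(text):
    """Split a user agent or wildcard pattern into lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def wildcard_to_regex(pattern):
    """Compile a pattern in which '*' matches any run of characters."""
    literal_parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile(".*".join(literal_parts), re.DOTALL)

def build_index(docs):
    """Map each token to the ids of the documents whose patterns contain it."""
    index = {}
    for doc_id, patterns in docs.items():
        for pattern in patterns:
            for token in tokenize(pattern):
                index.setdefault(token, set()).add(doc_id)
    return index

def search(user_agent, docs, index):
    # Step 1: tokenized retrieval gives candidates with partial matches.
    candidates = set()
    for token in tokenize(user_agent):
        candidates |= index.get(token, set())
    # Step 2: apply each candidate's stored wildcards to the untokenized query.
    return sorted(
        doc_id
        for doc_id in candidates
        if any(wildcard_to_regex(p).fullmatch(user_agent) for p in docs[doc_id])
    )

# Hypothetical documents; doc1 mirrors the example from the original question.
docs = {
    "doc1": ["Firefox*", "Mozilla/4.0*"],
    "doc2": ["Mozilla/5.0*"],
    "doc3": ["Opera*"],
}
index = build_index(docs)
user_agent = (
    "Mozilla/4.0+SonyEricssonC905v/R1DE+Browser/NetFront/3.4+Profile/MIDP-2.1"
    "+Configuration/CLDC-1.1+JavaPlatform/JP-8.4.1+UP.Link/6.3.1.20.0"
)
matches = search(user_agent, docs, index)
```

Here doc2 is retrieved in step 1 because it shares the token "mozilla" with the
query, but it is discarded in step 2 because "Mozilla/5.0*" does not match the
full user agent.  In Lucene or SOLR the second step would live in something
like a custom hit collector rather than in a Python loop.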

On Fri, Feb 5, 2010 at 9:09 AM, Niclas Rothman <niro@lechill.com> wrote:

> Hi Ted and thanks for all your efforts.
> Listen, I'm a little bit lost here trying to understand what you are trying
> to tell me :-)
>
> 1. I store my useragents in a field that is tokenized.
> 2. Then when I search, you are saying that I should "scan" down the matches
> via a SOLR function, or what?
> Are you referring to these functions in SOLR?
>
> http://wiki.apache.org/solr/FunctionQuery
>
>
> Sorry for not grasping immediately!
>
> Regards Niclas
>
> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: 05 February 2010 17:44
> To: general@lucene.apache.org
> Cc: java-user@lucene.apache.org
> Subject: Re: Wildcard searches????
>
> Tokenize your user agent strings, then store the tokenized form separately
> from the wildcard.  At retrieval time, scan down the matches and apply the
> wildcard from each document to your original query.  The SOLR function
> query might be useful for this, as would a custom hit collector.
>
> On Fri, Feb 5, 2010 at 7:57 AM, Niclas Rothman <niro@lechill.com> wrote:
>
> > Hi there, I am facing a problem and would like to ask the community for
> > some help.
> >
> > In my index I store browser useragent values as "wildcarded" / partial
> > values, meaning that an indexed document should only be shown to an end
> > user if his browser's useragent matches a wildcarded useragent in my
> > document.
> >
> > So what I have is actually a "reversed" matching: the wildcards are in my
> > document and NOT in my actual query.
> > Does anyone know if this "setup" is possible, e.g. to execute a query in
> > the style of:
> >
> > useragents:
> >
> "Mozilla/4.0+SonyEricssonC905v/R1DE+Browser/NetFront/3.4+Profile/MIDP-2.1+Configuration/CLDC-1.1+JavaPlatform/JP-8.4.1+UP.Link/6.3.1.20.0"
> >
> > In this example I would have a hit because Mozilla/4.0* matches the
> > useragent.
> >
> > <doc>
> > <useragents>
> >                Firefox*
> >                Mozilla/4.0*
> > </useragents>
> > </doc>
> >
> >
> > Regards
> > Niclas
> >
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>



-- 
Ted Dunning, CTO
DeepDyve
