lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stefan Groschupf ...@media-style.com>
Subject Re: Can I Do Reverse Search?
Date Sun, 23 Oct 2005 22:37:01 GMT
two document fields one named positive one called negative
you query have to look somehow like this:
positive: (keyword1 keywordN) AND NOT negative:(keyword1 keywordN)

Am 23.10.2005 um 20:50 schrieb Sam Lee:

> Yes, I thought of that.  But since the ads have
> negative keywords, it's very possible for the webpage
> to match the ads but not the other way around because
> of the negative keywords.  So the system cannot be
> sure that the ads match the webpage until it uses ads'
> keyword and negative keywords to rematch the webpage.
> This is a lot of resource for having each ad to match
> the webpage again..
>
> webpage www.mysite.com --match--> ad1.....ad101
>
> Then I match each ad with the webpage.
> But due to negative keywords:
> ad1....ad100 --NOT match--> www.mysite.com
> ad101 --match--> www.mysite.com
>
> # of queries = 102
>
> If there is a way to match content with boolean
> keywords, then # of query is 1 only.  Huge difference!
>
> Any idea how to accomplish this?
>
> --- Stefan Groschupf <sg@media-style.com> wrote:
>
>
>> Index the keywords of your ads with lucene.
>> Extract all words from your page (ajax), remove stop
>> words, build a
>> query from the page words by connect the words with
>> OR and you will
>> find the best matching ad.
>> You may need to limit the words per page or set the
>> maximum clauses
>> to a much higher number.
>> HTH
>> Stefan
>>
>> Am 23.10.2005 um 18:39 schrieb Sam Lee:
>>
>>
>>> ok, I am implementing a google
>>>
>> adsense/adwords-like
>>
>>> system.  For examples, the website has keywords
>>>
>> "nike
>>
>>> red shoe", so it can match text ad with keywords
>>>
>> "nike
>>
>>> shoe -blue".  Of course, I can always use the text
>>>
>> ad
>>
>>> keywords to match the website's keywords.  But it
>>>
>> will
>>
>>> take too much resource to have all ads to rematch
>>>
>> the
>>
>>> new websites whenever new websites joins the ad
>>> network.  So I need a way for the new websites to
>>> "reverse match" the text ads.
>>>
>>> So if new website has "nike red shoes" as
>>>
>> keywords,
>>
>>> then it should match all text ads with "nike shoes
>>> -blue".  The only difference is that it is doing
>>>
>> it in
>>
>>> reverse.
>>>
>>> many thanks.
>>>
>>>
>>> --- Erik Hatcher <erik@ehatchersolutions.com>
>>>
>> wrote:
>>
>>>
>>>
>>>
>>>> Sam - I'm not quite sure I follow you, but let's
>>>>
>> see
>>
>>>> if this fits...
>>>> you want to have a document and see if a query
>>>> matches it?  Please
>>>> elaborate more on what you're after.  Maybe what
>>>> you're looking for
>>>> is the contrib/memory and the MemoryIndex within
>>>> that Subversion area.
>>>>
>>>>      Erik
>>>>
>>>>
>>>> On 22 Oct 2005, at 18:54, Sam Lee wrote:
>>>>
>>>>
>>>>
>>>>> Hi,
>>>>>   Normally, lucene or Nutch can match query
>>>>>
>> "nike
>>
>>>>>
>>>>>
>>>> shoe
>>>>
>>>>
>>>>> -blue" with "red nike shoe".
>>>>>
>>>>> But what about matching "red nike shoe" with
>>>>>
>> query
>>
>>>>> "nike shoe -blue"?  It is the other way around.
>>>>>
>>>>>
>>>> Can I
>>>>
>>>>
>>>>> do it with a combinations of API?
>>>>>
>>>>> Many thanks.
>>>>>
>>>>>
>>>>>
>> __________________________________________________
>>
>>>>> Do You Yahoo!?
>>>>> Tired of spam?  Yahoo! Mail has the best spam
>>>>>
>>>>>
>>>> protection around
>>>>
>>>>
>>>>> http://mail.yahoo.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
> ---------------------------------------------------------------------
>
>>>
>>>
>>>>> To unsubscribe, e-mail:
>>>>>
>>>>>
>>>> java-user-unsubscribe@lucene.apache.org
>>>>
>>>>
>>>>> For additional commands, e-mail:
>>>>>
>>>>>
>>>> java-user-help@lucene.apache.org
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
> ---------------------------------------------------------------------
>
>>>
>>>
>>>> To unsubscribe, e-mail:
>>>> java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail:
>>>> java-user-help@lucene.apache.org
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>> __________________________________
>>> Yahoo! Mail - PC Magazine Editors' Choice 2005
>>> http://mail.yahoo.com
>>>
>>>
>>>
>>
>>
> ---------------------------------------------------------------------
>
>>> To unsubscribe, e-mail:
>>>
>> java-user-unsubscribe@lucene.apache.org
>>
>>> For additional commands, e-mail:
>>>
>> java-user-help@lucene.apache.org
>>
>>>
>>>
>>>
>>>
>>
>>
>>
> ---------------------------------------------------------------
>
>> company:        http://www.media-style.com
>> forum:        http://www.text-mining.org
>> blog:            http://www.find23.net
>>
>>
>>
>>
>
>
>
>
>
> __________________________________
> Yahoo! Mail - PC Magazine Editors' Choice 2005
> http://mail.yahoo.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>

---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net



Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message